- 1.0 Impact Factor
- 1.8 CiteScore
- 23 days Time to First Decision
Stats, Volume 3, Issue 4
2020 December - 7 articles
Cover Story: Analyzing massive databases is a key challenge in many applications today, and parallel computing is one suitable approach. One way to run statistical analyses over such databases is through the sparklyr package, which lets an R application use Apache Spark as its framework. This paper presents an analysis of Brazilian public data from the Bolsa Família Programme (BFP, a conditional cash transfer), based on local processing of a large data set of 1.26 billion observations totaling more than 100 GB. Our goal was to understand how this social program acts in different cities and to identify variables potentially important to the BFP utilization rate. The analysis was performed with random forests (RF) and indicated the high importance of variables such as family income, education, occupation, and the density of people in homes.
- Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
- You may sign up for email alerts to receive the table of contents of newly released issues.
- PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.

