add merge_households parameter #31

Open
2 of 8 tasks
rafapereirabr opened this issue Sep 30, 2023 · 3 comments


rafapereirabr commented Sep 30, 2023

Add a merge_households (logical) parameter to indicate whether the function should merge household variables into the output data.

  • 1970 population
  • 1980 population
  • 1991 population
  • 2000 population
  • 2000 families
  • 2010 population
  • 2010 emigration
  • 2010 mortality

rafapereirabr commented Nov 18, 2023

In the 2010 census, the weight variable has the same code, V0010, in all data sets (households, population, mortality and emigration). In this case, we will need to add a suffix to the column name. So it would look like this:

library(dplyr)
library(censobr)

# read the 2010 mortality and household data without loading them as data.frames
mort <- censobr::read_mortality(year = 2010, as_data_frame = FALSE)
hous <- censobr::read_households(year = 2010, as_data_frame = FALSE)

# rename columns
mort <- dplyr::rename_with(mort,
                           ~ paste0(.x, '_mort', recycle0 = TRUE),
                           starts_with("V0010"))

hous <- dplyr::rename_with(hous,
                           ~ paste0(.x, '_hous', recycle0 = TRUE),
                           starts_with("V0010"))

# merge on all columns shared by both tables (dplyr's default when `by` is not given)
df <- dplyr::left_join(hous, mort)

names(df)
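
To try the renaming idea without downloading any census data, here is a minimal sketch on toy tables. The column `id_household`, the `age` column, and all values are made up for illustration and are not the real censobr column names or join keys; only `V0010` comes from the discussion above. It assumes the mortality records go on the left side of the join.

```r
library(dplyr)

# Toy stand-ins for the household and mortality tables (hypothetical columns and values)
hous <- data.frame(id_household = c(1, 2, 3),
                   V0010 = c(10.5, 20.1, 15.3))
mort <- data.frame(id_household = c(1, 1, 3),
                   V0010 = c(10.5, 10.5, 15.3),
                   age = c(70, 82, 65))

# Add a suffix to the weight variable in each table so the names no longer clash
hous <- rename_with(hous, ~ paste0(.x, "_household"), starts_with("V0010"))
mort <- rename_with(mort, ~ paste0(.x, "_mortality"), starts_with("V0010"))

# Merge household variables into the mortality records using an explicit key
df <- left_join(mort, hous, by = "id_household")

names(df)
# "id_household" "V0010_mortality" "age" "V0010_household"
```

Naming the key explicitly in `by` avoids relying on whichever columns happen to share a name after the renaming step.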

I haven't checked for the other years yet. What do you think, @antrologos?

@rafapereirabr

This has now been implemented in 19b0799.

library(censobr)

df <- read_mortality(year = 2010, 
                     merge_households = TRUE)

df <- read_population(year = 2000,
                      merge_households = TRUE)

@rafapereirabr

I'm still facing memory challenges when implementing this parameter in the functions that work with large data sets, like population and families, so I'm reopening this issue.
