I've got about 1,200 different categories, which I've put into 1,200 dataframes (not sure if that's a good way to go about this, but it's what I did).
What's the question?
Tea!
Your first name isn't Teacher?
Uggg... it's not teacher!
How has no one got this yet?
Also, Zoogs first name is Bob? Sweet!
WHHHHHHHHHHHHHHHHHHHATTTTTT?
Bob Zoogs, that's me.
Do you use R, by any chance? "Dataframes" sounds rather R-like.
It *sounds* like which category it is determines what exactly you'll be doing to them. I don't completely see why there's one dataframe per category, rather than simply having (for example) the LastName variable be a category.
I apologize, since if this is a Python syntax question I probably can't answer it. Generally speaking, I guess I'd keep a list of all the dataframes (in Python, everything's technically a pointer, right? So this shouldn't be too costly?) and iterate through the list. In R you'd probably use something from the lapply family. Pseudocode-wise:
for (i = 0; i < length(dfList); i++) {
    doTheFancyThings(dfList[i]);  // pass the current dataframe (or its category)
}
Where:
doTheFancyThings(fancyThingType) {
    // if the thirty things depend on the category...
    switch (fancyThingType) {
        case 'zoogs':
            return doAwesomeThings();
        case 'Damodred':
            return doDredfulThings();
        default:
            return banTeach();
    }
}
...although I think you basically have that part already, so I'm not sure this has helped.
Sorry if that was super basic. I should say I barely know python. I looked up looping over dataframes and saw something about pandas. [Technically, I think I only got stuff about looping through *one* dataframe]. Heh. Seems like a fun language!
R should be pretty well suited to something like this. The methodology is Split/Apply/Combine but that kind of supposes you have everything (including the LastName field) in *one* dataframe. Have it as a factor, and then I think it's tapply your way through that. I'm not super familiar with dplyr, but I'm sure that would provide an even easier grammar for the operation.
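Since the original question is in Python, here's a rough pandas sketch of the same split/apply/combine idea, assuming everything does fit in *one* dataframe. The `score` column, the toy data, and the `do_fancy_things` helper are all made up for illustration; only the `LastName` idea comes from the thread.

```python
import pandas as pd

# Hypothetical data: one row per record, with the category kept as a column.
df = pd.DataFrame({
    "LastName": ["Zoogs", "Zoogs", "Damodred", "Damodred", "Damodred"],
    "score": [1.0, 3.0, 10.0, 20.0, 30.0],
})

def do_fancy_things(values: pd.Series) -> float:
    """Placeholder for the real per-category work; here, just the mean."""
    return values.mean()

# Split by LastName, apply the function to each group's scores,
# and combine the results into one Series indexed by LastName.
result = df.groupby("LastName")["score"].apply(do_fancy_things)
print(result)
```

No loop over 1,200 names is needed here, because `groupby` discovers the categories from the column itself.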
The reason the last names need to be dataframes is due to memory. From what I've read the dataset I have will be too big to keep it all together while I'm doing these things to it, so I'm going to merge them back together in the end. Also, I know how to do the fancy things to the names individually. I need the loop to go through all of them instead of naming the 1,200 names. Anyhow... I will probably pester the people of stackoverflow again.
Oh, wow, that's interesting. I'm not sure how memory issues work in python. Please keep us posted, as I'll be curious to see the solution!
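If memory really is the constraint, one possible pattern is to load a single category at a time, keep only the (smaller) result, and merge the results at the end. This is only a sketch under a big assumption: that each last name lives in its own CSV file, which the thread doesn't actually say. The file names, the `value` column, and the `summarize` helper are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

def summarize(frame: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the real work: reduce one category to a summary row."""
    return pd.DataFrame({"rows": [len(frame)], "total": [frame["value"].sum()]})

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    # Hypothetical setup: one CSV per last name (real data would already exist).
    pd.DataFrame({"value": [1, 2, 3]}).to_csv(data_dir / "Zoogs.csv", index=False)
    pd.DataFrame({"value": [10, 20]}).to_csv(data_dir / "Damodred.csv", index=False)

    summaries = []
    for path in sorted(data_dir.glob("*.csv")):
        frame = pd.read_csv(path)  # only this one category is loaded
        summary = summarize(frame)
        summary.insert(0, "LastName", path.stem)  # file name doubles as the category
        summaries.append(summary)  # keep the small result, not the full frame

# Merge the per-category results back together at the end.
combined = pd.concat(summaries, ignore_index=True)
print(combined)
```

The loop finds the names by globbing the directory, so nobody has to type out 1,200 of them.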
Also, does this help?
http://stackoverflow.com/questions/36601956/how-can-i-iterate-through-multiple-dataframes-to-select-a-column-in-each-in-pyth
They have
for name in dfList:
Or: pandas http://pandas.pydata.org/pandas-docs/stable/groupby.html
(I can't tell if memory will play a factor there in your case. If you already have the data manually split out, is it not possible to have that in a list and loop over it?)
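For what it's worth, looping over already-split dataframes might look something like the sketch below. It assumes the pieces are held in a dict keyed by last name; the `frames_by_name` dict, the `value` column, and `do_fancy_things` are stand-ins for illustration, not anything from the original data.

```python
import pandas as pd

# Hypothetical stand-in for the 1,200 pre-split dataframes, keyed by last name.
frames_by_name = {
    "Zoogs": pd.DataFrame({"value": [1, 2, 3]}),
    "Damodred": pd.DataFrame({"value": [10, 20]}),
}

def do_fancy_things(frame: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the real work: add a column with doubled values."""
    out = frame.copy()
    out["doubled"] = out["value"] * 2
    return out

processed = []
for name, frame in frames_by_name.items():
    result = do_fancy_things(frame)
    result["LastName"] = name  # remember which category each row came from
    processed.append(result)

# Merge everything back together at the end.
combined = pd.concat(processed, ignore_index=True)
print(combined)
```

A dict rather than a plain list keeps the name attached to each dataframe, which makes it easy to tag the rows before merging.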