Anyone know Python or Anaconda?

Mierin · Nov 1, 2016

No Sir Mix a Lot references or Out of Context Quotes tomfoolery please.

(Ok, that was really the only reason I mentioned anaconda).

zoogs · Nov 1, 2016

By "know", is "just starting to use it" enough?

What's the question?

teachercd · Nov 1, 2016

Yep...in my pants

C N Red · Nov 1, 2016

I believe they are snakes. Python of the family Pythonidae and genus Python. Anaconda of the family Boidae and genus Eunectes.

Mierin · Nov 1, 2016

zoogs said:
What's the question?

I've got about 1,200 different categories, which I've put into 1,200 dataframes (not sure if that's a good way to go about this, but it's what I did).

Example:

LastName   FirstName
CD         Teacher
Zoogs      Bob
Damodred   Moiraine
Damodred   Alastair
Zoogs      Larry
Zoogs      Bob

Here LastName is the category, and I want to do "stuff" to the first names. By "stuff" I mean lots of different operations. There's a separate dataframe for each last name, and I want to do the same "stuff" to the first names in every one of them. I just can't figure out how to write a loop that goes through multiple dataframes.

In basic English the loop would be:

For each last name:
    Do these 30 fancy things to the first names

Not sure why I only just thought of this, but once the data is split into dataframes I could delete the last name column, which might simplify things for me. I started learning Python a week ago. There are about 3 other columns in the data set.
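One way the loop described above might look in pandas, as a minimal sketch rather than a definitive answer: keep the per-category dataframes in a dict keyed by last name and iterate over it. The `frames` dict, the `do_fancy_things` function, and the counting it does are all hypothetical stand-ins for the real data and the real 30 operations.

```python
import pandas as pd

# Hypothetical stand-in for the 1,200 per-category dataframes:
# one dict keyed by last name instead of 1,200 separately named variables.
frames = {
    "CD": pd.DataFrame({"FirstName": ["Teacher"]}),
    "Zoogs": pd.DataFrame({"FirstName": ["Bob", "Larry", "Bob"]}),
    "Damodred": pd.DataFrame({"FirstName": ["Moiraine", "Alastair"]}),
}


def do_fancy_things(df):
    """Placeholder for the real operations; here it just counts each first name."""
    return df["FirstName"].value_counts()


# For each last name: do the same operations to the first names.
results = {}
for last_name, df in frames.items():
    results[last_name] = do_fancy_things(df)
```

Because the last name is the dict key, the LastName column is not needed inside each dataframe, and nothing has to be named 1,200 times.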

teachercd · Nov 1, 2016

Uggg... it's not teacher!

How has no one got this yet?

Also, Zoogs' first name is Bob? Sweet!

zoogs · Nov 1, 2016

Bob Zoogs, that's me.

Do you use R, by any chance? "Dataframes" sounds rather R-like.

It *sounds* like which category it is determines what exactly you'll be doing to them. I don't completely see why there's one dataframe per category, rather than simply having (for example) the LastName variable be a category.

I apologize, since if this is a Python syntax question I probably can't answer. Generally speaking, I guess I'd have a list of all the dataframes (in Python, everything's technically a pointer, right? So this shouldn't be too costly?) and iterate through the list. In R you'd probably use some member of the lapply family. Pseudocode-wise:

for(i = 0; i < length(dfList); i++) {
doTheFancyThings();
}
Where:

Code:

doTheFancyThings(fancyThingType) {
  // if the thirty things depend on the category...
  switch(fancyThingType) {
    case 'zoogs':
        return doAwesomeThings();
        break;
    case 'Damodred':
        return doDredfulThings();
        break;
    default:
        return banTeach();
        break;
    }
}

...although I think you basically have that part already, so I'm not sure this has helped
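For what it's worth, Python (as of this thread) has no switch statement, so the pseudocode above is usually written as a dict that maps each category to a function. This is only a sketch: the function bodies are invented placeholders, and the names are just the pseudocode's names converted to Python style.

```python
def do_awesome_things(first_names):
    # Placeholder operation: upper-case every name.
    return [name.upper() for name in first_names]


def do_dredful_things(first_names):
    # Placeholder operation: reverse every name.
    return [name[::-1] for name in first_names]


def ban_teach(first_names):
    # Placeholder default: return nothing.
    return []


# Maps a category (last name) to the function that handles it.
DISPATCH = {
    "Zoogs": do_awesome_things,
    "Damodred": do_dredful_things,
}


def do_the_fancy_things(category, first_names):
    """Look up the handler for this category, falling back to the default."""
    handler = DISPATCH.get(category, ban_teach)
    return handler(first_names)
```

If the same operations apply to every category, the dispatch table is unnecessary and a single function called inside the loop is enough.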

Sorry if that was super basic. I should say I barely know Python. I looked up looping over dataframes and saw something about pandas. [Technically, I think I only found material on looping through *one* dataframe.] Heh. Seems like a fun language!

R should be pretty well suited to something like this. The methodology is Split/Apply/Combine, but that assumes you have everything (including the LastName field) in *one* dataframe. Make LastName a factor, and then I think you can tapply your way through it. I'm not super familiar with dplyr, but I'm sure it would provide an even easier grammar for the operation.
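For comparison, pandas has its own split/apply/combine through `groupby`. A minimal sketch, assuming everything (including LastName) fits in one dataframe; the rows reuse the example from earlier in the thread, and picking `count` and `nunique` as the "fancy things" is purely illustrative.

```python
import pandas as pd

# Sample rows from the example earlier in the thread, kept in ONE dataframe.
df = pd.DataFrame({
    "LastName": ["CD", "Zoogs", "Damodred", "Damodred", "Zoogs", "Zoogs"],
    "FirstName": ["Teacher", "Bob", "Moiraine", "Alastair", "Larry", "Bob"],
})

# Split by LastName, apply two aggregations to FirstName, and combine the
# per-group results into one summary table indexed by LastName.
summary = df.groupby("LastName")["FirstName"].agg(["count", "nunique"])
```

Here `summary` has one row per last name, with the number of first names and the number of distinct first names in that group.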

Mierin · Nov 1, 2016

teachercd said:
Uggg... it's not teacher!

How has no one got this yet?

Also, Zoogs first name is Bob? Sweet!

Your first name isn't Teacher?

WHHHHHHHHHHHHHHHHHHHATTTTTT?

teachercd · Nov 1, 2016

Moiraine said:
teachercd said:

Uggg... it's not teacher!

How has no one got this yet?

Also, Zoogs first name is Bob? Sweet!

Your first name isn't Teacher?

WHHHHHHHHHHHHHHHHHHHATTTTTT?

Tea!

Mierin · Nov 1, 2016

zoogs said:
Bob Zoogs, that's me.

Do you use R, by any chance? "Dataframes" sounds rather R-like.

It *sounds* like which category it is determines what exactly you'll be doing to them. I don't completely see why there's one dataframe per category, rather than simply having (for example) the LastName variable be a category.

I apologize, since if this is a python syntax question I probably can't answer. Generally speaking, I guess I'd have a list of all the dataframes (in python, everything's technically a pointer, right? So this shouldn't be too costly?) and iterate through the list. In R you'd probably use some form of lapply family. Pseudocode wise

for(i = 0; i < length(dfList); i++) {
doTheFancyThings();
}
Where:

Code:

doTheFancyThings(fancyThingType) {
  // if the thirty things depend on the category...
  switch(fancyThingType) {
    case 'zoogs':
        return doAwesomeThings();
        break;
    case 'Damodred':
        return doDredfulThings();
        break;
    default:
        return banTeach();
        break;
    }
}

...although I think you basically have that part already, so I'm not sure this has helped

Sorry if that was super basic. I should say I barely know python. I looked up looping over dataframes and saw something about pandas. [Technically, I think I only got stuff about looping through *one* dataframe]. Heh. Seems like a fun language!

Yes, I know R. Not super well, but better than Python. But the things I need to do have to be done in Python.

The reason each last name needs its own dataframe is memory. From what I've read, the dataset I have will be too big to keep all together while I'm doing these things to it, so I'm going to merge the pieces back together at the end. Also, I know how to do the fancy things to the names individually. I need the loop to go through all of them instead of naming the 1,200 names. Anyhow... I will probably pester the people of Stack Overflow again.
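A rough sketch of that plan (process each piece on its own, then merge at the end), assuming the processed pieces together do fit in memory. The `frames` dict, `do_fancy_things`, and the `NameLength` column are hypothetical; `memory_usage(deep=True)` is included only as a way to check how big the pieces actually are.

```python
import pandas as pd

# Hypothetical per-last-name pieces, keyed by last name.
frames = {
    "CD": pd.DataFrame({"FirstName": ["Teacher"]}),
    "Zoogs": pd.DataFrame({"FirstName": ["Bob", "Larry", "Bob"]}),
    "Damodred": pd.DataFrame({"FirstName": ["Moiraine", "Alastair"]}),
}

# Total bytes used by all pieces, including the string contents.
total_bytes = sum(df.memory_usage(deep=True).sum() for df in frames.values())


def do_fancy_things(df):
    """Placeholder operation: add a column with the length of each first name."""
    out = df.copy()
    out["NameLength"] = out["FirstName"].str.len()
    return out


# Process one piece at a time, keeping only the results.
processed = {name: do_fancy_things(df) for name, df in frames.items()}

# Merge the pieces back together; the dict keys become a LastName column.
combined = (
    pd.concat(processed, names=["LastName", "row"])
    .reset_index(level="LastName")
)
```

If even the results are too large to hold at once, each processed piece could be written to disk inside the loop instead of being collected in `processed`.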

knapplc · Nov 1, 2016

zoogs said:
Bob Zoogs, that's me.

Do you use R, by any chance? "Dataframes" sounds rather R-like.

It *sounds* like which category it is determines what exactly you'll be doing to them. I don't completely see why there's one dataframe per category, rather than simply having (for example) the LastName variable be a category.

I apologize, since if this is a python syntax question I probably can't answer. Generally speaking, I guess I'd have a list of all the dataframes (in python, everything's technically a pointer, right? So this shouldn't be too costly?) and iterate through the list. In R you'd probably use some form of lapply family. Pseudocode wise

for(i = 0; i < length(dfList); i++) {
doTheFancyThings();
}
Where:

Code:

doTheFancyThings(fancyThingType) {
  // if the thirty things depend on the category...
  switch(fancyThingType) {
    case 'zoogs':
        return doAwesomeThings();
        break;
    case 'Damodred':
        return doDredfulThings();
        break;
    default:
        return banTeach();
        break;
    }
}

...although I think you basically have that part already, so I'm not sure this has helped

Sorry if that was super basic. I should say I barely know python. I looked up looping over dataframes and saw something about pandas. [Technically, I think I only got stuff about looping through *one* dataframe]. Heh. Seems like a fun language!

R should be pretty well suited to something like this. The methodology is Split/Apply/Combine but that kind of supposes you have everything (including the LastName field) in *one* dataframe. Have it as a factor, and then I think it's tapply your way through that. I'm not super familiar with dplyr, but I'm sure that would provide an even easier grammar for the operation.

zoogs · Nov 1, 2016

Moiraine said:
The reason the last names need to be dataframes is due to memory. From what I've read the dataset I have will be too big to keep it all together while I'm doing these things to it, so I'm going to merge them back together in the end. Also, I know how to do the fancy things to the names individually. I need the loop to go through all of them instead of naming the 1,200 names. Anyhow... I will probably pester the people of stackoverflow again.

Oh, wow, that's interesting. I'm not sure how memory issues work in Python. Please keep us posted, as I'll be curious to see the solution!

Also, does this help?

http://stackoverflow.com/questions/36601956/how-can-i-iterate-through-multiple-dataframes-to-select-a-column-in-each-in-pyth

They have

for name in dfList:

Or: pandas http://pandas.pydata.org/pandas-docs/stable/groupby.html
(I can't tell if memory will play a factor there in your case. If you already have the data manually split out, is it not possible to have that in a list and loop over it?)
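To make the groupby option a bit more concrete, here is a hedged sketch of looping over the groups of a single dataframe, which gives the same "for each last name" shape without creating 1,200 dataframes by hand. The sample data and the `first_names_by_last_name` dict are made up for illustration, and whether this is workable memory-wise for the real dataset is not something this sketch can answer.

```python
import pandas as pd

# Sample rows from the example earlier in the thread, in one dataframe.
df = pd.DataFrame({
    "LastName": ["CD", "Zoogs", "Damodred", "Damodred", "Zoogs", "Zoogs"],
    "FirstName": ["Teacher", "Bob", "Moiraine", "Alastair", "Larry", "Bob"],
})

# Iterating over a groupby yields (key, sub-dataframe) pairs, one per last name.
first_names_by_last_name = {}
for last_name, group in df.groupby("LastName"):
    # Placeholder "fancy thing": collect the distinct first names, sorted.
    first_names_by_last_name[last_name] = sorted(group["FirstName"].unique().tolist())
```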

Mierin · Nov 1, 2016

zoogs said:
Moiraine said:

The reason the last names need to be dataframes is due to memory. From what I've read the dataset I have will be too big to keep it all together while I'm doing these things to it, so I'm going to merge them back together in the end. Also, I know how to do the fancy things to the names individually. I need the loop to go through all of them instead of naming the 1,200 names. Anyhow... I will probably pester the people of stackoverflow again.

Oh, wow, that's interesting. I'm not sure how memory issues work in python. Please keep us posted, as I'll be curious to see the solution!

Also, does this help?

http://stackoverflow.com/questions/36601956/how-can-i-iterate-through-multiple-dataframes-to-select-a-column-in-each-in-pyth

They have

for name in dfList:
Or: pandas http://pandas.pydata.org/pandas-docs/stable/groupby.html
(I can't tell if memory will play a factor there in your case. If you already have the data manually split out, is it not possible to have that in a list and loop over it?)

I've actually been to that post, but since Python is so new to me it doesn't make a lot of sense, and I can't really translate it to what I want to do. Pandas is what I'm using.

huKSer · Nov 1, 2016

Count me out

I did Fortran at UNL using punch cards.
