DataFrames

A DataFrame is a two-dimensional, labeled data structure that holds other data structures inside it: each column is a one-dimensional Series.

Construction

Think of some data that you can put into sequences of containers (a.k.a. rows of columns).

3 x 4 dataset:

DataFrames are made from rows and columns. The rows and columns have a list-like structure and can be constructed with lists. Create an empty DataFrame and give it a column from a list. Then create another and add it to the first.

garden = root.garden

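import pandas as pd

# Start from an empty DataFrame, then add a column from a list.
quark = pd.DataFrame()
quark['type'] = ['moonflower1', 'moonflower2']

          type
0  moonflower1
1  moonflower2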

# Create an empty frame with named columns, then fill each column from a list.
quarkFeatures = pd.DataFrame(columns=['charge', 'bucket'])
quarkFeatures['charge'] = [1, -1]
quarkFeatures['bucket'] = ['fermion', 'fermion']

   charge   bucket
0       1  fermion
1      -1  fermion

quarkFeatures['type']=quark

   charge   bucket    type
0     1.0  fermion      up
1    -1.0  fermion    down
2     0.5  fermion  beauty
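
As a complementary sketch, a DataFrame can also be built in one step from a dictionary of equal-length lists, where each key becomes a column name. The values reuse the charge, bucket, and type entries from the code above; the variable name quark_features is only illustrative.

```python
import pandas as pd

# Each dictionary key becomes a column; each list supplies that column's values.
quark_features = pd.DataFrame({
    "charge": [1, -1],
    "bucket": ["fermion", "fermion"],
})

# Adding a column afterwards works the same way as in the step-by-step version.
quark_features["type"] = ["moonflower1", "moonflower2"]

print(quark_features)
print(quark_features.shape)  # (number of rows, number of columns)
```

Building from a dictionary keeps the column names and their values side by side, which tends to be easier to read than filling an empty frame column by column.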

One of the coolest things about DataFrames is that they evolved to work with real data. In fact, you can give Pandas a data file, and it will construct a DataFrame for you. Let's make a data file, and then load it up.

Load a CSV:

quanta = pd.read_csv('~/data/particles.csv')

  examples in nature      quanta  charge     type
0           sunlight      photon    0.00    boson
1          lightning    electron   -1.00  fermion
2             matter    up quark    0.66  fermion
3             matter  down quark   -0.33  fermion
4          moonlight      photon    0.00    boson
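
The "make a data file" step is not shown, so here is a minimal sketch of one way to do it. It writes the rows from the table above to a CSV file and reads them back. The temporary directory is an assumption made so the sketch runs anywhere; it stands in for ~/data/particles.csv.

```python
import os
import tempfile

import pandas as pd

# Rows copied from the table above.
rows = [
    ("sunlight", "photon", 0.00, "boson"),
    ("lightning", "electron", -1.00, "fermion"),
    ("matter", "up quark", 0.66, "fermion"),
    ("matter", "down quark", -0.33, "fermion"),
    ("moonlight", "photon", 0.00, "boson"),
]
columns = ["examples in nature", "quanta", "charge", "type"]

# Hypothetical location: a temporary directory instead of ~/data.
csv_path = os.path.join(tempfile.mkdtemp(), "particles.csv")

# index=False keeps the row numbers out of the file.
pd.DataFrame(rows, columns=columns).to_csv(csv_path, index=False)

# read_csv takes the column names from the first line of the file.
quanta = pd.read_csv(csv_path)
print(quanta)
```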

Editing the DataFrame

Your DataFrame can be grouped, or reshuffled.

# Group rows by charge and take the maximum of the other columns in each group.
quantaTypeXCharge = quanta.groupby(['charge'])[['quanta', 'type']].max()

            quanta     type
charge
-1.00     electron  fermion
-0.33   down quark  fermion
 0.00       photon    boson
 0.66     up quark  fermion
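
A small self-contained sketch of both ideas, grouping and reshuffling, follows. The frame is rebuilt inline from the particle table shown earlier so the snippet does not depend on a file; the names charge_by_type and by_charge are illustrative.

```python
import pandas as pd

quanta = pd.DataFrame({
    "quanta": ["photon", "electron", "up quark", "down quark", "photon"],
    "charge": [0.00, -1.00, 0.66, -0.33, 0.00],
    "type": ["boson", "fermion", "fermion", "fermion", "boson"],
})

# Grouping: one row per particle type, summarizing the charge column.
charge_by_type = quanta.groupby("type")["charge"].agg(["count", "min", "max"])
print(charge_by_type)

# Reshuffling: reorder the rows by charge, lowest first.
by_charge = quanta.sort_values("charge")
print(by_charge)
```

groupby splits the rows into groups, and the aggregation then collapses each group to a single row. sort_values keeps every row and only changes their order.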

If you need to get only subsets of data, you can slice them.

proton.loc[1:2]

cols = ['down', 'beauty']

print(proton.loc[1:2, cols])
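
The proton frame is not defined in this section, so the sketch below uses a hypothetical stand-in with placeholder numbers to show what the slices select. One detail worth seeing in action: .loc slices by label and includes the end label, while .iloc slices by position and excludes the end.

```python
import pandas as pd

# Hypothetical stand-in for `proton`; the values are placeholders.
proton = pd.DataFrame({
    "up": [10, 11, 12, 13],
    "down": [20, 21, 22, 23],
    "beauty": [30, 31, 32, 33],
})

cols = ["down", "beauty"]

# Label-based: rows labeled 1 and 2 (end label included), two columns.
by_label = proton.loc[1:2, cols]
print(by_label)

# Position-based: only the row at position 1 (end position excluded).
by_position = proton.iloc[1:2]
print(by_position)
```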

Plotting Data

Once your data is loaded into the frame, you can see it by plotting it.

Write line, scatter, and box plots:

line

import matplotlib.pyplot as plt

cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='line', x='d', y='sc p')
plt.show()

scatter

cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='scatter', x='d', y='sc p')
plt.show()

box

cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='box', subplots=False, x='d', y=['sc p', 'cc p'])
plt.show()
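
Since cluster1.csv is not included here, the following sketch draws the same three plot kinds from a small frame of placeholder values. The column names match the ones used above; the numbers, and the choice of the non-interactive Agg backend, are assumptions made so the snippet runs without a data file or a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values; the real contents of cluster1.csv are not shown.
cluster = pd.DataFrame({
    "d": [1, 2, 3, 4],
    "sc p": [0.9, 0.7, 0.4, 0.2],
    "cc p": [0.8, 0.5, 0.3, 0.1],
})

# One figure with three panels, one per plot kind.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
cluster.plot(kind="line", x="d", y="sc p", ax=axes[0], title="line")
cluster.plot(kind="scatter", x="d", y="sc p", ax=axes[1], title="scatter")
cluster[["sc p", "cc p"]].plot(kind="box", ax=axes[2], title="box")
fig.tight_layout()
```

Passing ax= places each plot on a chosen panel. With an interactive backend, plt.show() would display the figure; fig.savefig with a file name would write it to disk.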

Merging DataFrames

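Merging DataFrames is accomplished through stacking, joining, and merging.

We can stack DataFrames as rows by appending. Append two Series (the one-dimensional counterpart of a DataFrame). Series.append was removed in pandas 2.0, so pd.concat does the appending here:

leptons = pd.Series(['electron', 'muon', 'tau'])
neutrinos = pd.Series(['electron neutrino',
                       'muon neutrino',
                       'tau neutrino'])
leptons = pd.concat([leptons, neutrinos])
print(leptons)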

0 electron

1 muon

2 tau

0 electron neutrino

1 muon neutrino

2 tau neutrino
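
Note that the printed result repeats the index labels 0, 1, 2, because each Series keeps its own index when stacked. A short sketch, assuming a fresh 0..5 index is wanted, uses the ignore_index option of pd.concat:

```python
import pandas as pd

leptons = pd.Series(["electron", "muon", "tau"])
neutrinos = pd.Series(["electron neutrino", "muon neutrino", "tau neutrino"])

# Default behavior: the original labels are kept, so 0, 1, 2 appear twice.
stacked = pd.concat([leptons, neutrinos])
print(list(stacked.index))

# ignore_index=True discards the old labels and numbers the rows 0..5.
renumbered = pd.concat([leptons, neutrinos], ignore_index=True)
print(renumbered)
```

Duplicate labels are legal, but a lookup such as stacked.loc[0] then returns two rows, which is easy to overlook.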

By concatenating, we can stack DataFrames either vertically (axis=0) or horizontally (axis=1).

fermions = pd.read_csv('~/data/fermions.csv')

bosons = pd.read_csv('~/data/bosons.csv')

buckets = [fermions, bosons]

print(pd.concat(buckets, axis=1))

       quanta     type  quanta   type
0    electron  fermion  photon  boson
1    up quark  fermion     NaN    NaN
2  down quark  fermion     NaN    NaN
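
To compare the two axes side by side, here is a sketch that rebuilds the two frames inline (their contents are taken from the printed result above, standing in for the CSV files) and checks the shape of each result.

```python
import pandas as pd

# Inline stand-ins for fermions.csv and bosons.csv.
fermions = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
bosons = pd.DataFrame({"quanta": ["photon"], "type": ["boson"]})

# axis=1: frames sit next to each other; rows are matched by index label.
side_by_side = pd.concat([fermions, bosons], axis=1)
print(side_by_side.shape)  # 3 rows, 4 columns

# axis=0: frames are stacked as rows; columns are matched by name.
stacked = pd.concat([fermions, bosons], axis=0, ignore_index=True)
print(stacked.shape)  # 4 rows, 2 columns
```

The NaN cells in the horizontal result appear because bosons has no rows labeled 1 or 2 to line up with.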

Stack DataFrames into a multi-indexed DataFrame by passing keys, as well as axis, to the concatenation, like so:

vertical = pd.concat(buckets, keys=['fermions', 'bosons'], axis=0)
print(vertical)

horizontal = pd.concat(buckets, keys=['fermions', 'bosons'], axis=1)
print(horizontal)

                quanta     type
fermions 0    electron  fermion
         1    up quark  fermion
         2  down quark  fermion
bosons   0      photon    boson

     fermions           bosons
       quanta     type  quanta   type
0    electron  fermion  photon  boson
1    up quark  fermion     NaN    NaN
2  down quark  fermion     NaN    NaN
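
The keys become the outer level of a MultiIndex, which makes it possible to pull one of the original frames back out. A minimal sketch, with the two frames rebuilt inline as stand-ins for the CSV files:

```python
import pandas as pd

fermions = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
bosons = pd.DataFrame({"quanta": ["photon"], "type": ["boson"]})
buckets = [fermions, bosons]

# axis=0: the keys label blocks of rows.
vertical = pd.concat(buckets, keys=["fermions", "bosons"], axis=0)
print(vertical.loc["bosons"])                   # only the boson rows
print(vertical.loc[("fermions", 1), "quanta"])  # a single cell

# axis=1: the keys label blocks of columns.
horizontal = pd.concat(buckets, keys=["fermions", "bosons"], axis=1)
print(horizontal["fermions"])                   # only the fermion columns
```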

Inner and outer joins are another way to merge DataFrames. You give keys and an axis, and specify the join: join='inner' keeps only the index labels shared by every DataFrame, while join='outer' (the default) keeps all labels and fills the gaps with NaN.

You can also join multiple merged DataFrames.

joinedBucket = [fermions, bosons]
joinedBucket = pd.concat(joinedBucket, keys=['type', 'charge'], axis=1, join='inner')
print(joinedBucket)

       type          charge
     quanta     type quanta   type
0  electron  fermion photon  boson
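
The section opener also lists merging, which matches rows on the values of a shared column instead of on the index. The sketch below uses pd.merge for that. The two small frames (called types and charges here) are hypothetical, assembled from particle names and charges that appear earlier in this document.

```python
import pandas as pd

# Hypothetical frames that share the "quanta" column.
types = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
charges = pd.DataFrame({
    "quanta": ["electron", "up quark", "photon"],
    "charge": [-1.00, 0.66, 0.00],
})

# Inner: only particles that appear in both frames.
inner = pd.merge(types, charges, on="quanta", how="inner")
print(inner)

# Outer: every particle from either frame; missing cells become NaN.
outer = pd.merge(types, charges, on="quanta", how="outer")
print(outer)
```

Here the inner result has two rows (electron and up quark), while the outer result has four, with NaN where a particle is missing from one side.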
