DataFrames
The DataFrame is like a super data structure: a labeled table whose columns are themselves data structures (pandas Series).
Construction
Think of some data that you can put into sequences of containers (a.k.a. rows of columns)
3 x 4 dataset:
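As a rough illustration of a 3 x 4 dataset, here is a minimal sketch that builds a DataFrame from a dictionary of lists. The column names and values are assumptions chosen to match the particle examples used later on this page; they are not taken from an original table.

```python
import pandas as pd

# Hypothetical 3 x 4 dataset: three rows (particles), four columns (attributes).
data = {
    "quanta": ["electron", "up quark", "photon"],
    "charge": [-1.0, 0.66, 0.0],
    "type": ["fermion", "fermion", "boson"],
    "example": ["lightning", "matter", "sunlight"],
}
particles = pd.DataFrame(data)

print(particles.shape)            # (number of rows, number of columns)
print(type(particles["charge"]))  # each column is a pandas Series
```

Each dictionary key becomes a column label, and each list becomes that column's values, so all lists must have the same length.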
DataFrames are made from rows and columns. The rows and columns have a list-like structure, and can be constructed with lists.
Create an empty DataFrame, and give it a column from a list. Then create another, and add the first to it as a column.
import pandas as pd
garden = root.garden
# start from an empty DataFrame, then add a column from a list
quark = pd.DataFrame()
quark['type'] = ['moonflower1', 'moonflower2']
          type
0  moonflower1
1  moonflower2
quarkFeatures = pd.DataFrame(columns=['charge', 'bucket'])
quarkFeatures['charge'] = [1, -1]
quarkFeatures['bucket'] = ['fermion', 'fermion']
   charge   bucket
0       1  fermion
1      -1  fermion
quarkFeatures['type'] = quark
   charge   bucket    type
0     1.0  fermion      up
1    -1.0  fermion    down
2     0.5  fermion  beauty
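The construction steps above can be run end to end as the following sketch. It assumes the quark types 'up' and 'down' (taken from the printed output) and assigns a single column, `quark['type']`, instead of the whole frame. The snake_case name `quark_features` is only illustrative.

```python
import pandas as pd

# Start empty, then add a column from a list.
quark = pd.DataFrame()
quark["type"] = ["up", "down"]

# A second frame, declared with its column names and filled column by column.
quark_features = pd.DataFrame(columns=["charge", "bucket"])
quark_features["charge"] = [1, -1]
quark_features["bucket"] = ["fermion", "fermion"]

# Assign one column (a Series) from the first frame;
# pandas matches values to rows by index label.
quark_features["type"] = quark["type"]
print(quark_features)
```

Because the assignment aligns on the index, rows whose labels are missing from the right-hand side would be filled with NaN rather than raising an error.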
One of the coolest things about DataFrames is that they evolved to work with real data. In fact, you can give Pandas a data file, and it will construct a DataFrame for you. Let's make a data file, and then load it up.
Load a CSV:
quanta = pd.read_csv('~/data/particles.csv')
  examples in nature      quanta  charge     type
0           sunlight      photon    0.00    boson
1          lightning    electron   -1.00  fermion
2             matter    up quark    0.66  fermion
3             matter  down quark   -0.33  fermion
4          moonlight      photon    0.00    boson
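To make the "make a data file, then load it" step concrete, here is a minimal sketch that writes the rows shown above to a CSV file and reads them back. The temporary directory is an assumption made so the example does not depend on `~/data/particles.csv` existing.

```python
import os
import tempfile

import pandas as pd

# Contents mirror the table printed above.
csv_text = """examples in nature,quanta,charge,type
sunlight,photon,0.00,boson
lightning,electron,-1.00,fermion
matter,up quark,0.66,fermion
matter,down quark,-0.33,fermion
moonlight,photon,0.00,boson
"""

# Write the file to a temporary location (hypothetical path), then load it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "particles.csv")
    with open(path, "w") as f:
        f.write(csv_text)
    quanta = pd.read_csv(path)

print(quanta)
```

`read_csv` takes the first line as the column names and infers a numeric type for the charge column.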
Editing the DataFrame
Your DataFrame can be grouped or reshuffled.
quantaTypeXCharge = quanta.groupby(['charge'])[['quanta', 'type']].max()
            quanta     type
charge
-1.00     electron  fermion
-0.33   down quark  fermion
 0.00       photon    boson
 0.66     up quark  fermion
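Grouping and reordering can be tried on a small in-memory frame, as in the sketch below. The frame repeats the particle data from the CSV example. Grouping by `type` and sorting by `charge` are choices made here for illustration.

```python
import pandas as pd

quanta = pd.DataFrame({
    "quanta": ["photon", "electron", "up quark", "down quark", "photon"],
    "charge": [0.0, -1.0, 0.66, -0.33, 0.0],
    "type": ["boson", "fermion", "fermion", "fermion", "boson"],
})

# Count the rows in each group.
counts = quanta.groupby("type").size()
print(counts)

# Aggregate a numeric column per group.
mean_charge = quanta.groupby("type")["charge"].mean()
print(mean_charge)

# Reorder the rows by a column's values (ascending by default).
sorted_quanta = quanta.sort_values("charge")
print(sorted_quanta)
```

A groupby result is indexed by the grouping column, which is why `charge` appears as the row label in the output above.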
If you need only a subset of the data, you can slice it.
proton.loc[1:2]
cols = ['down', 'beauty']
print(proton.loc[1:2, cols])
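The `proton` frame is not defined on this page, so the sketch below uses a hypothetical stand-in with placeholder numbers. It shows how label-based slicing with `.loc` differs from position-based slicing with `.iloc`.

```python
import pandas as pd

# Hypothetical stand-in for `proton`; the values are placeholders.
proton = pd.DataFrame({
    "up": [10, 11, 12, 13],
    "down": [20, 21, 22, 23],
    "beauty": [30, 31, 32, 33],
})

by_label = proton.loc[1:2]      # label-based: rows 1 and 2 (end label included)
by_position = proton.iloc[1:2]  # position-based: row 1 only (end excluded)

cols = ["down", "beauty"]
subset = proton.loc[1:2, cols]  # rows labeled 1 to 2, two named columns
print(subset)
```

The inclusive end of a `.loc` slice is easy to miss when coming from ordinary Python list slicing.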
Plotting Data
Once your data is loaded into the frame, you can see it by plotting it.
Write line, scatter, and box plots:
line:
import matplotlib.pyplot as plt
cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='line', x='d', y='sc p')
plt.show()
scatter:
cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='scatter', x='d', y='sc p')
box:
cluster = pd.read_csv('~/data/cluster1.csv')
cluster.plot(kind='box', subplots=False, x='d', y=['sc p', 'cc p'])
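Since `cluster1.csv` is not included here, the following sketch draws the same three kinds of plot from a small hypothetical frame. The column names `d`, `sc p`, and `cc p` come from the calls above, but the numbers are placeholders. The `Agg` backend and the in-memory PNG buffer are assumptions that let the code run without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for cluster1.csv; the values are placeholders.
cluster = pd.DataFrame({
    "d": [1, 2, 3, 4, 5],
    "sc p": [0.9, 0.7, 0.6, 0.4, 0.3],
    "cc p": [0.8, 0.75, 0.5, 0.45, 0.2],
})

# One figure with three side-by-side axes, one per plot kind.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
cluster.plot(kind="line", x="d", y="sc p", ax=axes[0])
cluster.plot(kind="scatter", x="d", y="sc p", ax=axes[1])
cluster[["sc p", "cc p"]].plot(kind="box", ax=axes[2])

# Render the figure to an in-memory PNG instead of opening a window.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

Passing `ax=` places each plot on a chosen axis. In an interactive session, `plt.show()` would display the figure instead.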
Merging DataFrames
Merging DataFrames is accomplished through stacking, joining, and merging.
We can stack DataFrames as rows by appending. Append two Series (the one-dimensional counterpart of a DataFrame):
leptons = pd.Series(['electron', 'muon', 'tau'])
neutrinos = pd.Series(['electron neutrino', 'muon neutrino', 'tau neutrino'])
# Series.append was removed in pandas 2.0; pd.concat stacks the same way
leptons = pd.concat([leptons, neutrinos])
print(leptons)
0             electron
1                 muon
2                  tau
0    electron neutrino
1        muon neutrino
2         tau neutrino
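The output above repeats the index labels 0 to 2, because each Series keeps its own index when stacked. A minimal sketch of renumbering the result uses `pd.concat` with `ignore_index=True`. The lepton names are taken from the printed output.

```python
import pandas as pd

leptons = pd.Series(["electron", "muon", "tau"])
neutrinos = pd.Series(["electron neutrino", "muon neutrino", "tau neutrino"])

# Default: each Series keeps its own labels, so 0, 1, 2 appear twice.
stacked = pd.concat([leptons, neutrinos])

# ignore_index=True discards the old labels and numbers the rows 0..5.
renumbered = pd.concat([leptons, neutrinos], ignore_index=True)
print(renumbered)
```

Duplicate labels are legal in pandas, but a lookup such as `stacked[0]` then matches more than one row, so renumbering is often the safer choice.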
By concatenating, we can stack DataFrames either vertically or horizontally by specifying the axis.
fermions = pd.read_csv('~/data/fermions.csv')
bosons = pd.read_csv('~/data/bosons.csv')
buckets = [fermions, bosons]
print(pd.concat(buckets, axis=1))
       quanta     type  quanta   type
0    electron  fermion  photon  boson
1    up quark  fermion     NaN    NaN
2  down quark  fermion     NaN    NaN
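To compare the two axes directly, the sketch below rebuilds `fermions` and `bosons` in memory from the printed output (an assumption, since the CSV files are not shown) and checks the shape of each result.

```python
import pandas as pd

# Stand-ins for fermions.csv and bosons.csv, matching the printed output.
fermions = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
bosons = pd.DataFrame({"quanta": ["photon"], "type": ["boson"]})

rows = pd.concat([fermions, bosons], axis=0)  # stack vertically: more rows
cols = pd.concat([fermions, bosons], axis=1)  # stack horizontally: more columns

print(rows.shape)
print(cols.shape)
```

With `axis=1` the frames are aligned on their row labels, which is why the shorter `bosons` frame is padded with NaN.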
Build a multi-indexed DataFrame with concatenation by passing keys as well as the axis, like so:
vertical = pd.concat(buckets, keys=['fermions', 'bosons'], axis=0)
print(vertical)
# axis=1 places the keyed frames side by side
horizontal = pd.concat(buckets, keys=['fermions', 'bosons'], axis=1)
print(horizontal)
                quanta     type
fermions 0    electron  fermion
         1    up quark  fermion
         2  down quark  fermion
bosons   0      photon    boson
     fermions           bosons
       quanta     type  quanta   type
0    electron  fermion  photon  boson
1    up quark  fermion     NaN    NaN
2  down quark  fermion     NaN    NaN
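Once keys are attached, they can be used to pull a block back out of the combined frame. A minimal sketch, again using in-memory stand-ins for the two CSV files (an assumption):

```python
import pandas as pd

# Stand-ins for fermions.csv and bosons.csv, matching the printed output.
fermions = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
bosons = pd.DataFrame({"quanta": ["photon"], "type": ["boson"]})

# keys become the outer level of a two-level row index.
keyed = pd.concat([fermions, bosons], keys=["fermions", "bosons"], axis=0)

print(keyed.loc["bosons"])         # every row under one outer key
print(keyed.loc[("fermions", 1)])  # one row: (outer key, original label)
```

The outer key records which source frame each row came from, information that a plain concatenation would lose.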
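Inner and outer joins are another way to merge DataFrames. You give keys and an axis, and specify the join: an inner join keeps only the index labels present in every frame, while an outer join keeps all labels and fills the gaps with NaN.
You can also join multiple merged DataFrames.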
joinedBucket = [fermions, bosons]
joinedBucket = pd.concat(joinedBucket, keys=['type', 'charge'], axis=1, join='inner')
print(joinedBucket)
       type           charge
     quanta     type  quanta   type
0  electron  fermion  photon  boson
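The difference between the two join modes shows up in the number of rows that survive. The sketch below uses in-memory stand-ins for `fermions` and `bosons` (an assumption) and concatenates them side by side with each mode.

```python
import pandas as pd

# Stand-ins for fermions.csv and bosons.csv, matching the printed output.
fermions = pd.DataFrame({
    "quanta": ["electron", "up quark", "down quark"],
    "type": ["fermion", "fermion", "fermion"],
})
bosons = pd.DataFrame({"quanta": ["photon"], "type": ["boson"]})

frames = [fermions, bosons]
names = ["fermions", "bosons"]

# inner: keep only row labels found in both frames (label 0 here).
inner = pd.concat(frames, keys=names, axis=1, join="inner")

# outer (the default): keep every row label, padding gaps with NaN.
outer = pd.concat(frames, keys=names, axis=1, join="outer")

print(inner)
print(outer)
```

Only label 0 exists in both frames, so the inner join returns a single row while the outer join returns three.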