pandas: A Summary of DataFrame Indexing Operations

Overview

This article explains how to get and set values in a DataFrame.

Getting the row index of a DataFrame

pandas.DataFrame.index returns the row index of a DataFrame.

In [1]:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)
df

	A	B	C	D	E
a	1	2	3	4	5
b	6	7	8	9	10
c	11	12	13	14	15
d	16	17	18	19	20
e	21	22	23	24	25

In [2]:

print(df.index)

Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

Getting the column index of a DataFrame

pandas.DataFrame.columns or pandas.DataFrame.keys() returns the column index of a DataFrame. Iterating over a DataFrame also yields its column labels.

In [3]:

print(df.columns)
print(df.keys())

for i in df:
    print(i, end=" ")

Index(['A', 'B', 'C', 'D', 'E'], dtype='object')
Index(['A', 'B', 'C', 'D', 'E'], dtype='object')
A B C D E

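Iterating over a DataFrame column by column

pandas.DataFrame.items() returns an iterable of (column label, column) pairs. pandas.DataFrame.iteritems() did the same thing, but it was deprecated in pandas 1.5 and removed in pandas 2.0.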

In [4]:

for name, col in df.items():  # df.iteritems() also worked before pandas 2.0
    print(f"name={name}")
    print(col, end="\n\n")

name=A
a     1
b     6
c    11
d    16
e    21
Name: A, dtype: int64

name=B
a     2
b     7
c    12
d    17
e    22
Name: B, dtype: int64

name=C
a     3
b     8
c    13
d    18
e    23
Name: C, dtype: int64

name=D
a     4
b     9
c    14
d    19
e    24
Name: D, dtype: int64

name=E
a     5
b    10
c    15
d    20
e    25
Name: E, dtype: int64

Iterating over a DataFrame row by row

pandas.DataFrame.iterrows() iterates over a DataFrame row by row. It yields (row label, row) pairs, where each row is a Series.

In [5]:

for i, row in df.iterrows():
    print(f"\nindex={i}")
    print(row)

index=a
A    1
B    2
C    3
D    4
E    5
Name: a, dtype: int64

index=b
A     6
B     7
C     8
D     9
E    10
Name: b, dtype: int64

index=c
A    11
B    12
C    13
D    14
E    15
Name: c, dtype: int64

index=d
A    16
B    17
C    18
D    19
E    20
Name: d, dtype: int64

index=e
A    21
B    22
C    23
D    24
E    25
Name: e, dtype: int64
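One consequence of each row being a single Series is that its values share one dtype. The sketch below uses a hypothetical two-column frame (mixed, with columns qty and ratio, neither of which appears in this article) to show the integer column coming back as float under iterrows(), while itertuples() keeps the integer.

```python
import pandas as pd

# Hypothetical frame (not from the article): one integer and one float column.
mixed = pd.DataFrame({"qty": [1, 2], "ratio": [0.5, 1.5]}, index=["a", "b"])
print(mixed.dtypes)

row_dtypes = []
for label, row in mixed.iterrows():
    # Each row is one Series, so the integer is upcast to match the float.
    row_dtypes.append(row.dtype)
    print(label, row.dtype, row["qty"])

# itertuples() builds each row from the individual columns instead,
# so the integer value is not converted to float.
first = next(mixed.itertuples())
print(first.qty, first.ratio)
```

If the per-column types matter inside the loop, itertuples() is usually the safer choice of the two.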

pandas.DataFrame.itertuples() also iterates over a DataFrame row by row. Each row is returned as a namedtuple.

In [6]:

for row in df.itertuples():
    print(row)

Pandas(Index='a', A=1, B=2, C=3, D=4, E=5)
Pandas(Index='b', A=6, B=7, C=8, D=9, E=10)
Pandas(Index='c', A=11, B=12, C=13, D=14, E=15)
Pandas(Index='d', A=16, B=17, C=18, D=19, E=20)
Pandas(Index='e', A=21, B=22, C=23, D=24, E=25)
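As a small sketch of how these namedtuples might be used, the fields can be read by attribute name, and the index and name parameters of itertuples() control the shape of each row. The variable names labels_and_a and plain_rows are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

# Read fields by attribute: Index holds the row label, A the value in column A.
labels_and_a = [(row.Index, row.A) for row in df.itertuples()]
print(labels_and_a)

# index=False drops the Index field; name=None yields plain tuples.
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows[0])
```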

Getting or setting values in a selected range of a DataFrame

A range can be selected in two ways: by index label or by integer position.
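The difference is easiest to see when the row labels are themselves integers. The following is a minimal sketch with a hypothetical frame (scores) whose labels deliberately do not match the positions.

```python
import pandas as pd

# Hypothetical frame: the row labels are integers in reverse order.
scores = pd.DataFrame({"score": [70, 80, 90]}, index=[2, 1, 0])

by_label = scores.loc[0, "score"]   # the row whose label is 0 (the last row)
by_position = scores.iloc[0, 0]     # the first row and first column

print(by_label, by_position)  # 90 70
```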

at, loc, __getitem__(), iat, and iloc can also assign values to the selected range.

In [7]:

df2 = df.copy()
df2.at["b", "B"] = 100
df2

	A	B	C	D	E
a	1	2	3	4	5
b	6	100	8	9	10
c	11	12	13	14	15
d	16	17	18	19	20
e	21	22	23	24	25

In [8]:

df2 = df.copy()
df2.loc["b":"d", "B":"D"] = 100
df2

	A	B	C	D	E
a	1	2	3	4	5
b	6	100	100	100	10
c	11	100	100	100	15
d	16	100	100	100	20
e	21	22	23	24	25

In [9]:

df2 = df.copy()
df2["B"] = 100
df2

	A	B	C	D	E
a	1	100	3	4	5
b	6	100	8	9	10
c	11	100	13	14	15
d	16	100	18	19	20
e	21	100	23	24	25

In [10]:

df2 = df.copy()
df2.iat[1, 1] = 100
df2

	A	B	C	D	E
a	1	2	3	4	5
b	6	100	8	9	10
c	11	12	13	14	15
d	16	17	18	19	20
e	21	22	23	24	25

In [11]:

df2 = df.copy()
df2.iloc[1:3, 1:3] = 100
df2

	A	B	C	D	E
a	1	2	3	4	5
b	6	100	100	9	10
c	11	100	100	14	15
d	16	17	18	19	20
e	21	22	23	24	25
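Assignment through loc also accepts a boolean condition as the row selector. The sketch below, which is an illustration rather than part of the original cells, sets columns B and C to 0 only in rows where column A exceeds 10.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

df2 = df.copy()
# Rows c, d, and e satisfy A > 10, so only their B and C values change.
df2.loc[df2["A"] > 10, ["B", "C"]] = 0
print(df2)
```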

Selecting by row labels (index)

	Get	Set	Indexing	Slice	Bool Indexing	Fancy Indexing
pandas.DataFrame.xs()	Yes	No	df.xs(index, axis=0)
pandas.DataFrame.loc	Yes	Yes	df.loc[index]	df.loc[slice]	df.loc[bools]	df.loc[indices]

Indexing

In [12]:

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

df.loc["b"]  # or df.xs("b", axis=0)

A     6
B     7
C     8
D     9
E    10
Name: b, dtype: int64

Slice

Unlike list slicing, label-based slicing includes the end point.

In [13]:

df.loc["b":"d"]

	A	B	C	D	E
b	6	7	8	9	10
c	11	12	13	14	15
d	16	17	18	19	20

Bool Indexing

In [14]:

df.loc[[True, False, True, False, True]]

	A	B	C	D	E
a	1	2	3	4	5
c	11	12	13	14	15
e	21	22	23	24	25
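In practice the boolean list is rarely written by hand and is usually derived from a condition on the data. The following sketch assumes the same df as above; the names mask, selected, and combined are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

# A comparison on a column produces a boolean Series aligned with the rows.
mask = df["A"] > 10
selected = df.loc[mask]
print(selected)

# Conditions are combined with & and |, with each condition in parentheses.
combined = df.loc[(df["A"] > 5) & (df["E"] < 25)]
print(combined)
```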

Fancy Indexing

In [15]:

df.loc[["a", "c", "e"]]

	A	B	C	D	E
a	1	2	3	4	5
c	11	12	13	14	15
e	21	22	23	24	25

Selecting by column labels (columns)

	Get	Set	Indexing	Slice	Bool Indexing	Fancy Indexing
pandas.DataFrame.get()	Yes	No	df.get(column)
pandas.DataFrame.xs()	Yes	No	df.xs(column, axis=1)
pandas.DataFrame.loc	Yes	Yes	df.loc[:, column]	df.loc[:, slice]	df.loc[:, bools]	df.loc[:, columns]
`__getitem__()`	Yes	Yes	df[column]			df[columns]

Indexing

In [16]:

df["B"]  # or df.get("B"), df.xs("B", axis=1), df.loc[:, "B"]

a     2
b     7
c    12
d    17
e    22
Name: B, dtype: int64

get() returns None when the specified column label does not exist. The value to return in that case can be passed as default.

In [17]:

print(df.get("F"))
print(df.get("F", default=-1))

None
-1

Slice

Unlike list slicing, label-based slicing includes the end point.

In [18]:

df.loc[:, "B":"D"]

	B	C	D
a	2	3	4
b	7	8	9
c	12	13	14
d	17	18	19
e	22	23	24

Bool Indexing

In [19]:

df.loc[:, [True, False, True, False, True]]

	A	C	E
a	1	3	5
b	6	8	10
c	11	13	15
d	16	18	20
e	21	23	25

Fancy Indexing

In [20]:

df[["A", "C", "E"]]  # or df.loc[:, ["A", "C", "E"]]

	A	C	E
a	1	3	5
b	6	8	10
c	11	13	15
d	16	18	20
e	21	23	25

Selecting by row and column labels

	Get	Set	Indexing	Slice	Bool Indexing	Fancy Indexing
pandas.DataFrame.at	Yes	Yes	df.at[index, index]
pandas.DataFrame.loc	Yes	Yes	df.loc[index, index]	df.loc[slice, slice]	df.loc[bools, bools]	df.loc[indices, indices]

Indexing

In [21]:

print(df.at["b", "B"])  # or df.loc["b", "B"]

7

Fancy Indexing

In [22]:

df.loc[["a", "c", "e"], ["A", "C", "E"]]

	A	C	E
a	1	3	5
c	11	13	15
e	21	23	25
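The table above also lists Slice and Bool Indexing for loc with both a row and a column selector. A minimal sketch of those two forms, assuming the same df (the names block, row_mask, col_mask, and masked are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

# Slice: both label slices include their end points.
block = df.loc["b":"d", "B":"D"]
print(block)

# Bool Indexing: one boolean per row and one per column.
row_mask = [True, False, True, False, True]
col_mask = [True, False, True, False, True]
masked = df.loc[row_mask, col_mask]
print(masked)
```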

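pandas.DataFrame.lookup() takes two label arrays of the same length, indices and columns. Calling lookup(indices, columns) returns the values of [df.loc[r, c] for r, c in zip(indices, columns)]. Note that lookup() was deprecated in pandas 1.2 and removed in pandas 2.0, so the cell below runs only on older versions.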

In [23]:

df.lookup(["a", "c", "e"], ["A", "A", "E"])  # requires pandas < 2.0
# Gets df.at["a", "A"], df.at["c", "A"], and df.at["e", "E"]

array([ 1, 11, 25])
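On pandas 2.0 and later, where lookup() is no longer available, one possible substitute is to convert the labels to positions and index the underlying NumPy array. This is a sketch of one approach and not an official drop-in replacement; it assumes the row and column labels are unique and all present.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)

rows = ["a", "c", "e"]
cols = ["A", "A", "E"]

# get_indexer() maps labels to integer positions (-1 for a missing label,
# so the labels should be validated first in real code).
row_pos = df.index.get_indexer(rows)
col_pos = df.columns.get_indexer(cols)

# NumPy pairs the two position arrays element by element.
values = df.to_numpy()[row_pos, col_pos]
print(values)  # [ 1 11 25]
```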

Selecting by row position

	Get	Set	Indexing	Slice	Fancy Indexing
pandas.DataFrame.iloc	Yes	Yes	df.iloc[position]	df.iloc[slice]	df.iloc[positions]
pandas.DataFrame.take	Yes	No			df.take(positions, axis=0)

Indexing

In [24]:

print(df.iloc[1])

A     6
B     7
C     8
D     9
E    10
Name: b, dtype: int64

Slice

Like list slicing, positional slicing excludes the end point.

In [25]:

df.iloc[1:3]

	A	B	C	D	E
b	6	7	8	9	10
c	11	12	13	14	15

Fancy Indexing

In [26]:

df.iloc[[0, 2, 4]]  # or df.take([0, 2, 4], axis=0)

	A	B	C	D	E
a	1	2	3	4	5
c	11	12	13	14	15
e	21	22	23	24	25

Selecting by column position

	Get	Set	Indexing	Slice	Fancy Indexing
pandas.DataFrame.iloc	Yes	Yes	df.iloc[:, position]	df.iloc[:, slice]	df.iloc[:, positions]
pandas.DataFrame.take	Yes	No			df.take(positions, axis=1)

Indexing

In [27]:

print(df.iloc[:, 1])

a     2
b     7
c    12
d    17
e    22
Name: B, dtype: int64

Slice

Like list slicing, positional slicing excludes the end point.

In [28]:

df.iloc[:, 1:3]

	B	C
a	2	3
b	7	8
c	12	13
d	17	18
e	22	23

Fancy Indexing

In [29]:

df.iloc[:, [0, 2, 4]]  # or df.take([0, 2, 4], axis=1)

	A	C	E
a	1	3	5
b	6	8	10
c	11	13	15
d	16	18	20
e	21	23	25

Selecting by row and column position

	Get	Set	Indexing	Slice	Fancy Indexing
pandas.DataFrame.iat	Yes	Yes	df.iat[position, position]
pandas.DataFrame.iloc	Yes	Yes	df.iloc[position, position]	df.iloc[slice, slice]	df.iloc[positions, positions]

Indexing

In [30]:

print(df.iat[1, 1])  # or df.iloc[1, 1]

7

Slice

In [31]:

df.iloc[1:3, 1:3]

	B	C
b	7	8
c	12	13

Fancy Indexing

In [32]:

df.iloc[[0, 2, 4], [0, 2, 4]]

	A	C	E
a	1	3	5
c	11	13	15
e	21	23	25
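Readers coming from NumPy may find this result surprising, because passing two position lists to a NumPy array pairs them element by element instead of selecting a block. The comparison below is a sketch on the same data; the names arr, block, paired, and same_block are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(1, 26).reshape(5, 5),
    index=["a", "b", "c", "d", "e"],
    columns=["A", "B", "C", "D", "E"],
)
arr = df.to_numpy()

# pandas: every listed row crossed with every listed column (a 3x3 block).
block = df.iloc[[0, 2, 4], [0, 2, 4]]

# NumPy: the lists are paired, giving arr[0, 0], arr[2, 2], arr[4, 4].
paired = arr[[0, 2, 4], [0, 2, 4]]

# np.ix_ builds the index needed to get the same 3x3 block from NumPy.
same_block = arr[np.ix_([0, 2, 4], [0, 2, 4])]

print(block.shape, paired, same_block.shape)
```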

