numpy – How to use loadtxt and savetxt

Overview

This article explains how to use numpy.loadtxt(), which reads data from a text file, and numpy.savetxt(), which saves an array to a text file.

Function list

numpy.loadtxt

Reads data from a text file.

numpy.loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes')

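Arguments

Name	Type	Default	Description
fname	file, str, pathlib.Path	(required)	File object, file name, or generator to read. If the file name ends in .gz or .bz2, the file is decompressed before it is read. A generator should return byte strings.
dtype	data-type	float	Data type of the returned array. For a structured data type, a 1-D array is returned in which each row is one element.
comments	str or sequence of str	'#'	Character or list of characters that marks the start of a comment. None means there are no comments.
delimiter	str	None	String that separates columns. The default is whitespace.
converters	dict	None	To apply a conversion to the field strings, pass the conversion functions as a dict. This can be used, for example, to convert missing values to NaN.
skiprows	int	0	The first skiprows lines are skipped.
usecols	int or sequence of int	None	Specify this to read only some of the columns.
unpack	bool	False	If True, the returned array is transposed so that it can be unpacked as x, y, z = loadtxt(...). For a structured data type, one array is returned per field.
ndmin	int	0	The returned array has at least ndmin dimensions. Must be 0, 1, or 2.
encoding	str	'bytes'	Encoding used to decode the input.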

Return value

Name	Description
out	Data read from the text file

Sample code

In [1]:

from io import StringIO

import numpy as np

# Whitespace-separated data
text = """1 2 3
4 5 6
7 8 9
"""

a = np.loadtxt(StringIO(text))
print(a)

# Comma-separated data with a header line
text = """a,b,c
1,2,3
4,5,6
7,8,9
"""

a = np.loadtxt(StringIO(text), delimiter=",", skiprows=1)
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]

Specifying the data type of the returned array – dtype

Use dtype to specify the data type of the returned array.

In [2]:

text = """1 2 3
4 5 6
7 8 9
"""

# The default is float.
a = np.loadtxt(StringIO(text))
print(a)

a = np.loadtxt(StringIO(text), dtype=int)
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
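
The argument table notes that a structured data type yields a 1-D array with one element per row. The following is a minimal sketch of that behavior; the field names id, x, and y and the sample values are made up for illustration.

```python
from io import StringIO

import numpy as np

text = """1 0.5 1.5
2 2.5 3.5
"""

# Hypothetical field names: one int column followed by two float columns.
dt = np.dtype([("id", "i4"), ("x", "f8"), ("y", "f8")])

# Each row of the text becomes one element of a 1-D structured array.
records = np.loadtxt(StringIO(text), dtype=dt)

print(records.shape)  # (2,)
print(records["id"])  # column access by field name
print(records["x"])
```

Individual columns are then addressed by field name instead of by column index.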

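Ignoring comments – comments

Set comments to the character that marks the start of a comment, and everything from that character to the end of the line is ignored.

Sample code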

In [3]:

text = """1 2 3
4 5 6
7 8 9
# comment
"""

# Ignore comments that start with "#".
a = np.loadtxt(StringIO(text), comments="#")
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
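
Since comments also accepts a list of markers, a file that mixes comment styles can be read in one call. The sketch below assumes a second marker "//" purely for illustration, and also shows that a trailing comment on a data line is stripped.

```python
from io import StringIO

import numpy as np

text = """1 2 3  # trailing comment on a data line
// a whole line using a second (hypothetical) marker
4 5 6
"""

# Both "#" and "//" are treated as comment starts.
a = np.loadtxt(StringIO(text), comments=["#", "//"])
print(a)
```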

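Specifying the column separator – delimiter

Use delimiter to specify the string that separates columns. By default, whitespace is treated as the separator. For example, pass delimiter="," for comma-separated data or delimiter="\t" for tab-separated data.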

In [4]:

text = """1,2,3
4,5,6
7,8,9
"""

# Read comma-separated data.
a = np.loadtxt(StringIO(text), delimiter=",")
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
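
The tab-separated case mentioned above works the same way. A short sketch with made-up data:

```python
from io import StringIO

import numpy as np

# Columns are separated by a tab character.
text = "1\t2\t3\n4\t5\t6\n"

a = np.loadtxt(StringIO(text), delimiter="\t")
print(a)
```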

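Skipping the first few lines – skiprows

If the first line is a header, specify skiprows to skip it. Otherwise loadtxt fails because the header text cannot be converted to a number.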

In [5]:

text = """a b c
1 2 3
4 5 6
7 8 9
"""

# Skip the first line.
a = np.loadtxt(StringIO(text), skiprows=1)
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]

Using only some of the columns – usecols

To read only some of the columns, list the column indices (starting at 0) in usecols.

In [6]:

text = """1 2 3
4 5 6
7 8 9
"""

# Use only the first and third columns.
a = np.loadtxt(StringIO(text), usecols=[0, 2])
print(a)

[[1. 3.]
 [4. 6.]
 [7. 9.]]

Unpacking by column – unpack

With unpack=True, the returned array has shape (number of columns, number of rows), so each column can be received through tuple unpacking.

In [7]:

text = """1 2 3
4 5 6
7 8 9
"""

x, y, z = np.loadtxt(StringIO(text), unpack=True)
print(x, y, z)

[1. 4. 7.] [2. 5. 8.] [3. 6. 9.]
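
The converters and ndmin arguments from the table have no section of their own, so here is a minimal sketch of both. The missing-value marker "NA" and the helper na_to_nan are assumptions made for this example. encoding="utf-8" is passed explicitly so that the converter receives str rather than bytes regardless of the NumPy version's default.

```python
from io import StringIO

import numpy as np

text = """1,2,3
4,NA,6
"""


def na_to_nan(field):
    # Hypothetical helper: treat the marker "NA" as a missing value.
    return float("nan") if field.strip() == "NA" else float(field)


# Apply the converter to column 1 only.
a = np.loadtxt(
    StringIO(text), delimiter=",", converters={1: na_to_nan}, encoding="utf-8"
)
print(a)

# A single-row file gives a 1-D array by default ...
row = np.loadtxt(StringIO("1 2 3\n"))
print(row.shape)  # (3,)

# ... while ndmin=2 guarantees a 2-D result.
row2d = np.loadtxt(StringIO("1 2 3\n"), ndmin=2)
print(row2d.shape)  # (1, 3)
```

ndmin=2 is convenient when later code indexes the result as rows and columns and the input may contain only one line.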

numpy.savetxt

Saves an array to a text file.

numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ', encoding=None)

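Arguments

Name	Type	Default	Description
fname	filename, file handle	(required)	File name or file handle to save to. If the file name ends in .gz, the file is saved in compressed gzip format.
X	1D or 2D array_like	(required)	Data to save to the text file.
fmt	str or sequence of strs	'%.18e'	Format of the output values.
delimiter	str	' '	String that separates columns. The default is a single space.
newline	str	'\n'	String that separates lines.
header	str	''	String written at the beginning of the file.
footer	str	''	String written at the end of the file.
comments	str	'# '	String prepended to the header and footer lines to mark them as comments.
encoding	{None, str}	None	Encoding used for the output file.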
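
As a rough check of the .gz behavior described for fname, the sketch below writes to a temporary directory (the file name data.txt.gz is arbitrary), inspects the first two bytes for the gzip signature, and reads the file back with loadtxt.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.txt.gz")  # arbitrary example name

    # The .gz suffix makes savetxt write gzip-compressed output.
    np.savetxt(path, a)

    # gzip files start with the two bytes 0x1f 0x8b.
    with open(path, "rb") as f:
        magic = f.read(2)

    # loadtxt decompresses .gz files transparently.
    b = np.loadtxt(path)

print(magic == b"\x1f\x8b")
print(np.array_equal(a, b))
```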

Sample code

In [8]:

a = np.random.rand(4, 3)

# Space-separated output
np.savetxt("output.txt", a)
!cat output.txt

3.536715494520643599e-01 8.698075387516082113e-01 2.170667168630509014e-01
6.114050471918656138e-01 3.637238185128842671e-01 2.572935765539627884e-01
5.300518432840184424e-01 5.182723624511025307e-01 2.560469493345988168e-01
3.566973853985763165e-01 4.672991873579654953e-01 7.585226330738038536e-01

In [9]:

# Comma-separated output with a header line (comments="" removes the "# " prefix)
np.savetxt("output.txt", a, delimiter=",", header="a,b,c", comments="")
!cat output.txt

a,b,c
3.536715494520643599e-01,8.698075387516082113e-01,2.170667168630509014e-01
6.114050471918656138e-01,3.637238185128842671e-01,2.572935765539627884e-01
5.300518432840184424e-01,5.182723624511025307e-01,2.560469493345988168e-01
3.566973853985763165e-01,4.672991873579654953e-01,7.585226330738038536e-01
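
If comments is left at its default of "# ", the header line is written with that prefix, and loadtxt then skips it as a comment without needing skiprows. The following sketch of such a round trip uses a temporary file whose name is chosen only for illustration.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "roundtrip.csv")  # arbitrary example name

    # With the default comments="# ", the header is written as "# a,b,c".
    np.savetxt(path, a, delimiter=",", header="a,b,c")

    with open(path) as f:
        first_line = f.readline().rstrip("\n")

    # The header line starts with "#", so loadtxt ignores it.
    b = np.loadtxt(path, delimiter=",")

print(first_line)
print(np.array_equal(a, b))
```

Passing comments="" as in the cell above produces a plain CSV header instead, in which case the file has to be read back with skiprows=1.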

Specifying the output format – fmt

By default, values are written in exponential notation with 18 digits after the decimal point. Use fmt to change this format.

In [10]:

# Three digits after the decimal point
np.savetxt("output.txt", a, fmt="%.3f")
!cat output.txt

0.354 0.870 0.217
0.611 0.364 0.257
0.530 0.518 0.256
0.357 0.467 0.759

In [11]:

# Exponential notation with three digits after the decimal point
np.savetxt("output.txt", a, fmt="%.3e")
!cat output.txt

3.537e-01 8.698e-01 2.171e-01
6.114e-01 3.637e-01 2.573e-01
5.301e-01 5.183e-01 2.560e-01
3.567e-01 4.673e-01 7.585e-01

In [12]:

a = np.random.randint(1, 5, (4, 3), dtype=int)

# Write integers without a decimal part.
np.savetxt("output.txt", a, fmt="%d")
!cat output.txt
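
fmt also accepts a sequence with one format per column. The sketch below writes to an in-memory StringIO buffer instead of a file so the result can be inspected directly; the data and the three formats are chosen only for illustration.

```python
from io import StringIO

import numpy as np

a = np.array([[1, 0.5, 1e-3], [2, 1.5, 2e-3]])

buf = StringIO()

# One format per column: integer, fixed-point, exponential.
np.savetxt(buf, a, fmt=["%d", "%.2f", "%.1e"], delimiter=",")

print(buf.getvalue())
# 1,0.50,1.0e-03
# 2,1.50,2.0e-03
```

The number of formats in the sequence has to match the number of columns in the array.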
