Python built-in functions
read()
: reads up to the requested number of bytes (characters in text mode) from the file, or the rest of the file if no size is passed
readline()
: reads a single line; the whole line is returned if no size argument is passed
readlines()
: reads all the remaining lines from the file and returns them as a list
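The three methods above can be combined on one file object, since each call continues from where the previous one stopped. The following is a minimal sketch; the file name `sample.txt` and its contents are made up for illustration.

```python
import os
import tempfile

# Write a small sample file so the example is self-contained
# (the file name and contents are illustrative assumptions).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

with open(path, encoding="utf-8") as f:
    first_chars = f.read(3)      # at most 3 characters in text mode
    rest_of_line = f.readline()  # remainder of the current line, newline included
    remaining = f.readlines()    # list of every line still left in the file

print(repr(first_chars), repr(rest_of_line), remaining)
```

Note that `readline()` and `readlines()` keep the trailing newline on each line, so a `strip()` is usually needed before further processing.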
Python csv library
csv.reader()
: returns an iterator over the given file that yields each row as a list of strings
csv.DictReader()
: if the file has headers (normally the first row, which names each field of data), this function reads each line as a dict
with the headers as keys
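As a small sketch of the difference between the two readers, the snippet below parses the same data both ways. The sample rows are invented, and `io.StringIO` stands in for a file opened with `open(..., newline="")`.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for an open CSV file.
raw = "name,age\nAda,36\nLinus,28\n"

# csv.reader yields every row, header included, as a list of strings.
rows = list(csv.reader(io.StringIO(raw)))

# csv.DictReader consumes the header row and yields one dict per data row.
records = list(csv.DictReader(io.StringIO(raw)))

print(rows[0])             # the header row
print(records[0]["name"])  # field access by header name
```

Both readers return every value as a string, so numeric columns such as `age` still need an explicit conversion.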
Import data using Pandas
pd.read_csv()
: reads a CSV file into a DataFrame
pd.read_excel()
: reads an Excel file into a DataFrame
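A minimal sketch of the Pandas route, assuming pandas is installed. The sample data is made up, and `io.StringIO` is used in place of a file path so the snippet runs on its own. The `pd.read_excel()` call is left commented out because it needs a real workbook and an Excel engine such as openpyxl; the file name shown there is hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data; a file path or URL could be passed instead.
raw = "name,age\nAda,36\nLinus,28\n"

# The header row becomes the column names and column dtypes are inferred.
df = pd.read_csv(io.StringIO(raw))
print(df.dtypes)

# Reading a workbook follows the same pattern (hypothetical file name):
# df_xlsx = pd.read_excel("data.xlsx", sheet_name=0)
```

Unlike the `csv` module, Pandas infers column types, so `age` arrives as an integer column rather than as strings.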
Options for importing large data
dask.dataframe
: large parallel DataFrame composed of many smaller Pandas DataFrames, split along the index
datatable
: a Python package for manipulating big 2-dimensional tabular data structures (aka data frames, up to 100GB)
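For a rough idea of how the Dask option looks in practice, here is a sketch that assumes dask is installed. The helper name `mean_by_group` and its parameters are illustrative, not part of any library. Dask is imported inside the function so that the definition can be loaded even where dask is absent.

```python
def mean_by_group(pattern, key, value):
    """Mean of column `value` per `key` over all CSV files matching `pattern`.

    Illustrative helper: `pattern` is a glob such as "data-*.csv".
    """
    import dask.dataframe as dd  # lazy import; requires dask to be installed

    # read_csv is lazy: it builds a task graph over partitions of the files.
    ddf = dd.read_csv(pattern)
    result = ddf.groupby(key)[value].mean()
    # compute() triggers the parallel read and returns a pandas Series.
    return result.compute()
```

A call such as `mean_by_group("data-*.csv", "name", "age")` would then process the matching files partition by partition instead of loading them all at once. datatable has a comparable entry point, `datatable.fread()`, which reads a file into a datatable Frame.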
The data and notebook can be accessed from my GitHub.