Python built-in functions
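read(): reads up to the given number of characters (bytes in binary mode) from the file, or the whole file if no size is passed
readline(): reads one line, up to and including the newline, if no arguments are passed; with a size argument, reads at most that many characters of the line
readlines(): reads all the lines, or the remaining lines, from the file and returns them as a list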
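As a minimal sketch of how the three built-in methods interact on a single file handle, the example below writes a small throwaway file (the file name and its contents are made up for illustration) and then reads it back piece by piece.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# The name "sample.txt" and its contents are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\nthird line\n")

with open(path) as f:
    head = f.read(5)             # the next 5 characters
    rest_of_line = f.readline()  # the remainder of the current line
    remaining = f.readlines()    # every line that is left, as a list

print(repr(head))          # 'first'
print(repr(rest_of_line))  # ' line\n'
print(remaining)           # ['second line\n', 'third line\n']
```

Each call continues from where the previous one stopped, which is why readlines() returns only the lines that have not been consumed yet.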
Python csv library
csv.reader(): returns a reader that iterates over the lines in the given file, yielding each row as a list of strings
csv.DictReader(): if the file has headers (normally the first row, which identifies each field of data), this function reads each line as a dict with the headers as keys
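The difference between the two readers is easiest to see side by side. This is a small sketch that parses an in-memory string instead of a real file; the column names and values are invented for illustration.

```python
import csv
import io

# In-memory CSV text standing in for a file opened with open(..., newline="").
text = "name,score\nalice,90\nbob,85\n"

# csv.reader yields every row, header included, as a list of strings.
rows = list(csv.reader(io.StringIO(text)))

# csv.DictReader consumes the first row as headers and yields one dict per data row.
records = list(csv.DictReader(io.StringIO(text)))

print(rows)        # [['name', 'score'], ['alice', '90'], ['bob', '85']]
print(records[0])  # {'name': 'alice', 'score': '90'}
```

Note that both readers return every value as a string; numeric conversion is left to the caller.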
Import data using Pandas
pd.read_csv(): reads a CSV file into a DataFrame
pd.read_excel(): reads an Excel file into a DataFrame
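A short sketch of the Pandas route, assuming pandas is installed. The CSV text and column names are invented, and an in-memory buffer stands in for a file path. pd.read_excel() is called the same way with a path to a workbook, but it needs an Excel engine such as openpyxl installed for .xlsx files, so it is not run here.

```python
import io

import pandas as pd

# Illustrative data; in practice you would pass a file path or URL.
csv_text = "name,score\nalice,90\nbob,85\n"

# read_csv infers column types, so "score" becomes an integer column.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (2, 2)
print(df["score"].sum())  # 175
```

Unlike the csv module, Pandas converts numeric columns automatically, which is why the sum works without an explicit cast.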
Options for importing large datasets
dask.dataframe: a module providing a large parallel DataFrame composed of many smaller Pandas DataFrames, split along the index
datatable: a Python package for manipulating big 2-dimensional tabular data structures (aka data frames, up to 100GB)
The data and notebook can be accessed from my GitHub.