Types
A common taxonomy of data types in machine learning
Categorical: qualitative
- Ordinal: values with an inherent order but unknown, unmeasurable distances between them
- e.g. first/second/third, good/bad
- Nominal: values (text or numbers) with no order
- e.g. cat/dog, genre, ethnicity
Numerical: quantitative
- Discrete: countable whole-number values
	- e.g. step count
- Continuous: values that can take any value within a range
	- e.g. width, height
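The taxonomy above can be sketched with pandas dtypes; the column names and values here are hypothetical examples, not from any real dataset:

```python
import pandas as pd

df = pd.DataFrame({
    "genre": ["rock", "jazz", "rock"],        # nominal: no order
    "rating": ["bad", "good", "good"],        # ordinal: ordered, distances unknown
    "step_count": [4200, 10531, 8700],        # discrete: whole numbers
    "height_cm": [170.2, 165.5, 181.0],       # continuous: decimal values
})

# Nominal data maps to an unordered categorical dtype
df["genre"] = pd.Categorical(df["genre"])

# Ordinal data maps to an ordered categorical dtype: bad < good
df["rating"] = pd.Categorical(df["rating"],
                              categories=["bad", "good"], ordered=True)

# The ordering lets pandas compare ordinal values even though
# the distance between "bad" and "good" is undefined
print(df["rating"].min())
```

Encoding the order explicitly matters later: many estimators treat ordinal and nominal features differently (e.g. ordinal encoding vs. one-hot encoding).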
Quality
- Training data should be representative of the data that will be predicted with
- Sampling noise: a small sample yields models with imprecise predictions because of chance variation.¹
- Sampling bias: some values are more or less likely to appear in the sample than in the underlying population, making the sample unrepresentative
- Discard or cap clear outliers
- Ignore or impute missing values, or train models with and without those values and compare their performance
- Feature engineering: feature selection (keeping useful features), feature extraction (combining features into more useful ones, e.g. dimensionality reduction), feature creation (gathering new data to derive new features)
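The missing-value and outlier steps above can be sketched as follows; the `width` column and the IQR-based outlier rule are illustrative assumptions, not prescriptions:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"width": [1.2, 1.4, np.nan, 1.3, 9.9]})

# Option 1: ignore rows with missing values
dropped = df.dropna()

# Option 2: impute missing values with the column median
imputed = df.fillna(df["width"].median())

# Discard outliers outside 1.5 * IQR of the quartiles
q1, q3 = imputed["width"].quantile([0.25, 0.75])
iqr = q3 - q1
mask = imputed["width"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
no_outliers = imputed[mask]  # the 9.9 reading is dropped
```

In practice one would also train models on `dropped` vs. `imputed` and compare validation scores, as the note suggests, rather than assume imputation always helps.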
Quantity
- Simple algorithms trained on large datasets can outperform complex algorithms trained on smaller ones
- Trade-off: the cost of acquiring and storing more data vs. the cost of tuning algorithms
References & Further Reading
- Banko, M. and Brill, E. (2001). Scaling to Very Very Large Corpora for Natural Language Disambiguation. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, 26–33.
- Halevy, A., Norvig, P. and Pereira, F. (2009). The Unreasonable Effectiveness of Data. IEEE Intelligent Systems, 24(2), 8–12. 10.1109/MIS.2009.36.
- Géron, A. (2017). Hands-On Machine Learning with Scikit-Learn & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly.
Footnotes
1. http://economistjourney.blogspot.com/2018/06/what-is-sampling-noise.html