87 Artificial Intelligence and Big Data
87.1 Artificial Intelligence — Concept
Artificial Intelligence (AI) = the science of building machines that can perform tasks requiring human-like intelligence: learning, reasoning, perception, problem-solving and language understanding. The term was coined by John McCarthy for the Dartmouth Conference (1956), which he organised with Marvin Minsky, Claude Shannon and Nathaniel Rochester. Alan Turing's "Computing Machinery and Intelligence" (1950), which proposed the Turing Test, laid the philosophical foundation.
87.2 AI Eras
| Era | Period | Highlights |
|---|---|---|
| Symbolic AI | 1950s-70s | Logic, expert systems |
| First AI Winter | mid-70s | Funding cuts |
| Expert Systems | 1980s | Commercial boom (XCON); built on earlier MYCIN, DENDRAL |
| Second AI Winter | late 80s-90s | Expert-system market collapse, funding cuts |
| Machine Learning rise | 1990s-2000s | Statistical ML |
| Deep Learning | 2012- | AlexNet, GPUs, big data |
| Generative AI / Foundation Models | 2022- | ChatGPT, GPT-4, Gemini, Claude |
87.3 Types of AI
By capability:
- ANI (Narrow / Weak AI) — task-specific (current).
- AGI (General AI) — human-level cognition (aspirational).
- ASI (Super AI) — exceeds human; future.
By functionality:
- Reactive machines — Deep Blue.
- Limited memory — modern ML.
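- Theory of mind (would model human beliefs and emotions; still a research goal).
- Self-aware (machine consciousness; hypothetical).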
87.4 ML / DL Approaches
- Supervised — labelled data (classification, regression).
- Unsupervised — unlabelled (clustering, dimensionality reduction).
- Semi-supervised (a few labelled examples plus many unlabelled ones).
- Self-supervised — modern LLMs.
- Reinforcement Learning (RL) — agent + reward (AlphaGo, autonomous driving).
- Transfer learning (reuse a model trained on one task for a related task).
- Federated learning — distributed; privacy.
- Active learning (the model requests labels for the most informative examples).
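The difference between the first two approaches can be sketched on a toy dataset. This is a minimal illustration, not a production method: the data values, the labels "low"/"high", and the helper names `predict_1nn` and `kmeans_1d` are all made up for this example. The supervised part uses the labels directly (1-nearest neighbour); the unsupervised part sees only the raw numbers and has to find the two groups itself (k-means with k = 2).

```python
from statistics import mean

# Supervised: every training point carries a label (hypothetical data).
labelled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
            (8.0, "high"), (8.5, "high"), (9.0, "high")]

def predict_1nn(x, training):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    nearest = min(training, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Unsupervised: the same values with labels removed; k-means must find the groups.
unlabelled = [value for value, _ in labelled]

def kmeans_1d(values, k=2, iterations=10):
    """Plain k-means on 1-D data (assumes k >= 2).

    Initial centroids are taken at evenly spaced positions in the sorted data.
    """
    ordered = sorted(values)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

print(predict_1nn(2.4, labelled))   # label borrowed from the nearest labelled point
print(kmeans_1d(unlabelled))        # cluster centres found without any labels
```

Note that k-means returns cluster centres, not names: attaching a meaning such as "low" or "high" to each cluster is still left to the analyst.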
87.5 Key AI Algorithms
- Linear / Logistic Regression.
- Decision Trees · Random Forests · XGBoost / LightGBM.
- k-NN · Naive Bayes.
- SVM (Support Vector Machine).
- k-Means · Hierarchical · DBSCAN clustering.
- PCA · t-SNE · UMAP dimensionality reduction.
- Neural Networks · CNN (vision) · RNN/LSTM (sequence) · Transformer (Vaswani 2017) · GAN (Goodfellow 2014) · Diffusion Models.
- Reinforcement Learning — Q-learning, DQN, PPO.
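As an illustration of the last item, here is a minimal sketch of tabular Q-learning. Everything about the environment is an assumption made for this example: a five-cell corridor, an agent starting in cell 0, and a reward of 1 only on reaching cell 4. The learning rate and discount values are likewise arbitrary choices. The agent explores purely at random; because Q-learning is off-policy, it can still learn the values of the best policy from that behaviour.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (toy environment)
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (illustrative values)

def step(state, action_idx):
    """Deterministic toy environment: reward 1.0 only on reaching the goal."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    """Learn a Q-table with the standard Q-learning update rule."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)          # purely random exploration
            nxt, reward, done = step(state, action)
            # Target = immediate reward + discounted best value of the next state.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy policy for the non-terminal cells: pick the action with the higher Q-value.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

DQN replaces the table `q` with a neural network, and PPO learns the policy directly instead of a value table; the agent, reward and update loop keep the same overall shape.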
87.6 Generative AI
- GPT-1 OpenAI 2018.
- BERT Google 2018.
- GPT-3 2020 (175B parameters).
- DALL-E image generation 2021.
- ChatGPT (Nov 2022): reached an estimated 100 million users in about two months.
- GPT-4 2023.
- Gemini Google 2023.
- Claude Anthropic 2023.
- Llama Meta 2023.
- Mistral, Falcon, Phi open weights.
- Sora video generation OpenAI 2024.
- Foundation Models — Stanford 2021 term.
- Indian: BharatGPT, Krutrim (Ola), Sarvam AI, AI4Bharat (IIT-Madras).
87.7 AI Applications in Management
- Marketing: personalisation, recommender systems, dynamic pricing, churn prediction, sentiment analysis.
- HR: resume screening, predictive turnover, learning paths, chatbots.
- Finance: fraud detection, credit scoring, robo-advisors, algorithmic trading, AML.
- Operations: predictive maintenance, demand forecasting, quality inspection (computer vision).
- SCM: route optimisation, inventory.
- CX: chatbots, voice assistants.
- Strategy: scenario simulation, M&A targeting.
- Risk: cybersecurity threat detection.
- Customer service: Zendesk, Salesforce Einstein.
87.8 Big Data — Concept
Big Data = datasets too large, fast-moving or complex for traditional database tools to handle. Doug Laney (2001, at META Group, which Gartner later acquired) defined the 3 Vs: Volume, Velocity, Variety. The list was later extended to 5 Vs (+ Veracity, Value) and 7 Vs (+ Variability, Visualisation).
87.9 V Framework for Big Data
- Volume — petabytes/exabytes.
- Velocity — streaming, real-time.
- Variety — structured, semi-structured, unstructured.
- Veracity — data quality, trust.
- Value — business outcome.
- Variability — changing meaning.
- Visualisation.
87.10 Big Data Technologies
- Hadoop — distributed storage (HDFS) + processing (MapReduce); Apache 2006.
- Apache Spark: in-memory processing engine that has largely replaced MapReduce.
- Hive (SQL-like HiveQL) · Pig (Pig Latin dataflow scripts): higher-level query layers over Hadoop.
- NoSQL DBs: MongoDB, Cassandra, HBase, Couchbase.
- Kafka — real-time streaming.
- Flink, Storm — stream processing.
- Cloud big data: AWS Redshift / EMR · GCP BigQuery · Azure Synapse · Snowflake · Databricks.
- Data lake / Data lakehouse ("lakehouse" was popularised by Databricks).
- Delta Lake, Apache Iceberg, Hudi.
- Python · R · Scala — languages.
- Tableau · Power BI · Looker · Qlik — viz.
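The MapReduce model behind Hadoop can be sketched in a few lines of plain Python. This is a single-process illustration of the idea only, not Hadoop's actual API: the function names `map_phase`, `shuffle_phase` and `reduce_phase` and the two input strings are invented for this example. In a real cluster the mappers and reducers run in parallel on different nodes and the framework does the shuffle.

```python
from collections import defaultdict

def map_phase(document):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reducer: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}

# Two hypothetical input splits, as if stored on different nodes.
splits = ["big data needs big storage", "data beats opinion"]

mapped = [pair for doc in splits for pair in map_phase(doc)]
counts = reduce_phase(shuffle_phase(mapped))
print(counts)
```

Spark expresses the same word count as a chain of transformations kept in memory, which is one reason it is faster for multi-step jobs.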
87.11 Analytics Maturity (Gartner)
- Descriptive — what happened?
- Diagnostic — why?
- Predictive — what will happen?
- Prescriptive — what should we do?
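The four stages can be walked through on one small series. The sketch below is illustrative only: the sales figures, the stock level and the ordering rule are hypothetical, and a fitted straight-line trend stands in for the much richer driver analysis a real diagnostic step would involve.

```python
from statistics import mean

sales = [100, 110, 120, 130]              # hypothetical monthly unit sales
months = list(range(1, len(sales) + 1))

# Descriptive: what happened?
average_sales = mean(sales)

# Diagnostic: why? Here a least-squares trend line shows the change is
# steady growth of `slope` units per month (a stand-in for real driver analysis).
x_bar, y_bar = mean(months), mean(sales)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(months, sales))
         / sum((x - x_bar) ** 2 for x in months))
intercept = y_bar - slope * x_bar

# Predictive: what will happen? Extend the trend one month ahead.
forecast = intercept + slope * (len(sales) + 1)

# Prescriptive: what should we do? Order enough to cover the forecast
# (assumed rule: forecast demand minus stock already on hand).
stock_on_hand = 60
order_quantity = max(0, round(forecast) - stock_on_hand)

print(average_sales, slope, forecast, order_quantity)
```

Each stage reuses the output of the one before it, which is why the model is described as a maturity ladder: value and difficulty both rise from descriptive to prescriptive.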
87.12 Indian Context
- NITI Aayog National AI Strategy (2018) — #AIforAll.
- IndiaAI Mission (2024) — INR 10,000+ Cr.
- Bhashini — language-translation platform.
- AI4Bharat at IIT Madras.
- CDAC, IISc — research.
- Indian AI companies: TCS, Infosys, Wipro, Mindtree, Fractal, Mu Sigma, Mathco, Tiger Analytics, LatentView, Tredence.
- Startups: Niramai, Haptik, Yellow.ai, Sigmoid.
- Data centres: Mumbai, Hyderabad, Chennai hubs.
- DPDP Act 2023 governs personal data.
87.13 Ethics, Risk and Governance
- Bias and fairness — algorithmic discrimination.
- Transparency / Explainability (XAI).
- Accountability.
- Privacy.
- Safety / Alignment.
- Job displacement.
- Deepfakes and misinformation.
- Frameworks: EU AI Act (2024) · NIST AI RMF · IEEE Ethically Aligned Design · OECD AI Principles · UNESCO AI Ethics.
- India: NITI Aayog Responsible AI; DPDP Act.
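Bias and fairness can be made concrete with a simple check on a model's decisions. The sketch below compares approval rates across two groups (a measure often called demographic parity). The decision lists are hypothetical, and this is only one of several fairness measures; it is not a test prescribed by any of the frameworks listed above.

```python
def selection_rate(decisions):
    """Share of positive decisions (1 = approved, 0 = rejected)."""
    return sum(decisions) / len(decisions)

# Hypothetical model outputs for two applicant groups.
group_a = [1, 1, 1, 0, 1, 0, 1, 1]   # 6 of 8 approved
group_b = [1, 0, 0, 0, 1, 0, 0, 1]   # 3 of 8 approved

rate_a = selection_rate(group_a)
rate_b = selection_rate(group_b)

# Difference in approval rates: 0 would mean both groups are approved equally often.
parity_difference = rate_a - rate_b

# Ratio of the lower rate to the higher rate: 1 would mean equal rates.
impact_ratio = min(rate_a, rate_b) / max(rate_a, rate_b)

print(rate_a, rate_b, parity_difference, impact_ratio)
```

A large gap does not by itself prove discrimination, but it flags a model for the kind of explainability and accountability review listed above.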
87.14 Modern Trends
- Generative AI everywhere.
- Multimodal models — text + image + audio + video.
- AI agents / autonomous agents.
- RAG (Retrieval-Augmented Generation).
- Small language models (SLMs).
- AI in scientific discovery — AlphaFold.
- Edge AI / TinyML.
- AI chips — Nvidia H100, TPU, Cerebras.
- Synthetic data.
- Quantum ML.
- AI safety and alignment — Anthropic, OpenAI.
- Sustainable AI / Carbon footprint.
- AI regulation worldwide.
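Of the trends above, RAG is the easiest to sketch in code: retrieve the most relevant text first, then hand it to the model inside the prompt. The example below is a simplified sketch under stated assumptions. The three documents are made up, word-count vectors stand in for a learned embedding model, and the final call to a language model is left out, so the output is just the assembled prompt. The helper names `vectorise`, `retrieve` and `build_prompt` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical knowledge base the model was not trained on.
DOCUMENTS = [
    "Hadoop stores data in HDFS and processes it with MapReduce.",
    "The Transformer architecture was introduced in 2017.",
    "Bhashini is an Indian language translation platform.",
]

def vectorise(text):
    """Bag-of-words term counts; a stand-in for a learned embedding model."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, documents, top_k=1):
    """Retrieval step: rank documents by similarity to the question."""
    q_vec = vectorise(question)
    ranked = sorted(documents, key=lambda d: cosine(q_vec, vectorise(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Augmentation step: place the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "Which platform handles Indian language translation?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)   # the generation step would send this prompt to an LLM
```

Because the answer is grounded in retrieved text, the knowledge base can be updated without retraining the model, which is the main practical appeal of RAG.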
87.15 Practice Questions
"Artificial Intelligence" was coined at:
The Turing Test (1950) is designed to determine:
The original 3 Vs of Big Data are:
Hadoop's processing framework is:
Transformer architecture (2017) is from:
ChatGPT was launched in:
Supervised learning uses:
Reinforcement learning is best illustrated by:
"What will happen?" is which analytics?
NITI Aayog's AI strategy is themed:
GANs were proposed by:
"Foundation Models" term was coined at:
EU AI Act was adopted in:
Bhashini is an Indian platform for:
RAG stands for:
87.15.1 Advanced Format Questions
Assertion (A): Big Data is characterised by 3 Vs.
Reason (R): The 3 Vs are Volume, Velocity and Variety (Laney, META Group, 2001).
ML types: (i) Supervised. (ii) Unsupervised. (iii) Reinforcement. (iv) Self-supervised.
Analytics maturity (Gartner): (i) Descriptive. (ii) Diagnostic. (iii) Predictive. (iv) Prescriptive.
87.16 Quick Recall
- AI: Dartmouth 1956 (McCarthy); Turing 1950 imitation game.
- AI eras: Symbolic → Winters → Expert Systems → ML → DL (2012 AlexNet) → Generative AI (2022 ChatGPT).
- AI types: ANI/AGI/ASI; Reactive/Limited memory/ToM/Self-aware.
- ML: Supervised · Unsupervised · Semi · Self-supervised · RL (AlphaGo) · Transfer · Federated · Active.
- Algorithms: Regression · DT · RF · XGBoost · k-NN · NB · SVM · k-Means · PCA/tSNE/UMAP · NN · CNN · RNN/LSTM · Transformer (Vaswani 2017) · GAN (Goodfellow 2014) · Diffusion.
- GenAI: GPT · BERT · DALL-E · ChatGPT Nov 2022 · GPT-4 · Gemini · Claude · Llama · Mistral · Sora; Indian — BharatGPT · Krutrim · Sarvam · AI4Bharat.
- Mgmt applications: Marketing personalisation · HR · Fraud · Predictive maintenance · SCM · CX · Strategy · Cyber.
- Big Data (Laney, META Group 2001; later Gartner): 3 Vs (Volume · Velocity · Variety); 5 Vs + Veracity, Value; 7 Vs + Variability, Visualisation.
- Tech stack: Hadoop (2006) + MapReduce · HDFS · Spark · Hive · Pig · NoSQL (Mongo, Cassandra, HBase) · Kafka · Flink · cloud (Redshift, BigQuery, Synapse, Snowflake, Databricks) · data lake/lakehouse · Delta Lake.
- Analytics maturity (Gartner): Descriptive · Diagnostic · Predictive · Prescriptive.
- India: NITI #AIforAll 2018 · IndiaAI Mission 2024 · Bhashini · AI4Bharat · CDAC · IISc; firms — TCS · Infosys · Fractal · Mu Sigma · Tiger Analytics · LatentView.
- Ethics: bias · XAI · accountability · privacy · safety/alignment · deepfakes; EU AI Act 2024 · NIST AI RMF · OECD · UNESCO; DPDP Act 2023.
- Modern: GenAI · multimodal · agents · RAG · SLMs · AlphaFold · Edge AI · Nvidia H100/TPU · synthetic data · quantum ML · safety/alignment · sustainable AI.