87  Artificial Intelligence and Big Data

87.1 Artificial Intelligence — Concept

Artificial Intelligence (AI) = the science of building machines that can perform tasks requiring human-like intelligence: learning, reasoning, perception, problem-solving and language understanding. The term is credited to John McCarthy, who used it in the 1955 proposal for the Dartmouth Conference (1956), co-authored with Marvin Minsky, Claude Shannon and Nathaniel Rochester. Alan Turing’s “Computing Machinery and Intelligence” (1950), which introduced the Turing Test, laid the philosophical foundation.

87.2 AI Eras

Tip: AI eras
| Era | Period | Highlights |
|---|---|---|
| Symbolic AI | 1950s-70s | Logic, expert systems |
| First AI Winter | mid-70s | Funding cuts |
| Expert Systems | 1980s | MYCIN, DENDRAL |
| Second AI Winter | late 80s-90s | |
| Machine Learning rise | 1990s-2000s | Statistical ML |
| Deep Learning | 2012- | AlexNet, GPUs, big data |
| Generative AI / Foundation Models | 2022- | ChatGPT, GPT-4, Gemini, Claude |

87.3 Types of AI

Tip: Types of AI
  • By capability:
    • ANI (Narrow / Weak AI) — task-specific (current).
    • AGI (General AI) — human-level cognition (aspirational).
    • ASI (Super AI) — exceeds human; future.
  • By functionality:
    • Reactive machines — Deep Blue.
    • Limited memory — modern ML.
    • Theory of mind (hypothetical; not yet achieved).
    • Self-aware (hypothetical; not yet achieved).

87.4 ML / DL Approaches

Tip: ML / DL types
  • Supervised — labelled data (classification, regression).
  • Unsupervised — unlabelled (clustering, dimensionality reduction).
  • Semi-supervised.
  • Self-supervised — modern LLMs.
  • Reinforcement Learning (RL) — agent + reward (AlphaGo, autonomous driving).
  • Transfer learning.
  • Federated learning — distributed; privacy.
  • Active learning.
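
The difference between the first two approaches can be seen in a minimal sketch. It assumes made-up numbers and two hypothetical helpers, `fit_line` and `two_means`, written only for illustration: the supervised part learns from (input, label) pairs, while the unsupervised part receives inputs only and has to find the groups itself.

```python
from statistics import mean

# Supervised: every input x comes with a label y to learn from.
labelled = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]  # made-up (x, y) pairs

def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Unsupervised: inputs only, no labels.
unlabelled = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]  # made-up 1-D points

def two_means(points, iterations=10):
    """Tiny k-means with k = 2, initialised at the smallest and largest point.

    Assumes the points are not all identical, so neither group ends up empty.
    """
    c0, c1 = min(points), max(points)
    for _ in range(iterations):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0, c1 = mean(g0), mean(g1)
    return sorted(g0), sorted(g1)

slope, intercept = fit_line(labelled)
clusters = two_means(unlabelled)
print(round(slope, 2), round(intercept, 2))
print(clusters)
```

In practice a library such as scikit-learn would be used for both tasks; the hand-written versions are here only to make the role of the labels visible.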

87.5 Key AI Algorithms

Tip: Major algorithms
  • Linear / Logistic Regression.
  • Decision Trees · Random Forests · XGBoost / LightGBM.
  • k-NN · Naive Bayes.
  • SVM (Support Vector Machine).
  • k-Means · Hierarchical · DBSCAN clustering.
  • PCA · t-SNE · UMAP dimensionality reduction.
  • Neural Networks · CNN (vision) · RNN/LSTM (sequence) · Transformer (Vaswani 2017) · GAN (Goodfellow 2014) · Diffusion Models.
  • Reinforcement Learning — Q-learning, DQN, PPO.
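
As an illustration of the last item, here is a minimal sketch of tabular Q-learning. The environment is an assumption made for the example: a hypothetical five-state corridor where the agent starts at state 0 and receives a reward of 1 on reaching state 4. The learning rate, discount factor and episode count are illustrative values, not recommendations.

```python
import random

N_STATES = 5                 # states 0..4 in a corridor; state 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (illustrative)

def step(state, action):
    """Deterministic toy environment: reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):                  # cap on steps per episode
            action = rng.choice(ACTIONS)      # purely random exploration
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: move Q(s, a) toward r + gamma * max Q(s', a')
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q

q = train()
# Greedy policy read off the learned table for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Because Q-learning is off-policy, the agent can explore at random and still learn the values of the greedy policy. DQN replaces the table `q` with a neural network; that step is not shown here.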

87.6 Generative AI

Tip: Generative AI landmarks
  • GPT-1 OpenAI 2018.
  • BERT Google 2018.
  • GPT-3 2020 (175B parameters).
  • DALL-E image generation 2021.
  • ChatGPT (Nov 2022), reported to have reached an estimated 100 million monthly users within about two months.
  • GPT-4 2023.
  • Gemini Google 2023.
  • Claude Anthropic 2023.
  • Llama Meta 2023.
  • Mistral, Falcon, Phi open weights.
  • Sora video generation OpenAI 2024.
  • Foundation Models — Stanford 2021 term.
  • Indian: BharatGPT, Krutrim (Ola), Sarvam AI, AI4Bharat (IIT-Madras).

87.7 AI Applications in Management

Tip: AI in management functions
  • Marketing: personalisation, recommender systems, dynamic pricing, churn prediction, sentiment analysis.
  • HR: resume screening, predictive turnover, learning paths, chatbots.
  • Finance: fraud detection, credit scoring, robo-advisors, algorithmic trading, AML.
  • Operations: predictive maintenance, demand forecasting, quality inspection (computer vision).
  • SCM: route optimisation, inventory.
  • CX: chatbots, voice assistants.
  • Strategy: scenario simulation, M&A targeting.
  • Risk: cybersecurity threat detection.
  • Customer service: Zendesk, Salesforce Einstein.

87.8 Big Data — Concept

Big Data = datasets too large or complex for traditional databases to handle. Doug Laney defined the 3 Vs (Volume, Velocity, Variety) in 2001 at META Group, which Gartner later acquired. The list was later extended to 5 Vs (+ Veracity, Value) and 7 Vs (+ Variability, Visualisation).

87.9 V Framework for Big Data

Tip: Big Data Vs
  • Volume — petabytes/exabytes.
  • Velocity — streaming, real-time.
  • Variety — structured, semi-structured, unstructured.
  • Veracity — data quality, trust.
  • Value — business outcome.
  • Variability — changing meaning.
  • Visualisation.

87.10 Big Data Technologies

Tip: Big Data stack
  • Hadoop — distributed storage (HDFS) + processing (MapReduce); Apache 2006.
  • Apache Spark (in-memory processing engine; has largely replaced MapReduce for new workloads).
  • Hive (SQL-like HiveQL) · Pig (Pig Latin dataflow scripts), both running over Hadoop.
  • NoSQL DBs: MongoDB, Cassandra, HBase, Couchbase.
  • Kafka — real-time streaming.
  • Flink, Storm — stream processing.
  • Cloud big data: AWS Redshift / EMR · GCP BigQuery · Azure Synapse · Snowflake · Databricks.
  • Data lake / Data lakehouse (the lakehouse term was popularised by Databricks).
  • Delta Lake, Apache Iceberg, Hudi.
  • Python · R · Scala — languages.
  • Tableau · Power BI · Looker · Qlik — viz.
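
The MapReduce model named in the Hadoop entry can be sketched on a single machine. This is a plain-Python imitation of the three phases (map, shuffle, reduce) using the classic word-count task. It does not use the Hadoop API, and the function names and input strings are assumptions made for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: combine the values for each key (here, sum the counts)."""
    return {key: sum(values) for key, values in grouped.items()}

documents = ["big data needs big storage", "data drives decisions"]  # made-up splits
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle_phase(mapped))
print(counts)
```

On a real cluster the map and reduce calls run in parallel on different nodes and the shuffle moves data across the network. MapReduce writes intermediate results to disk, which is the step Spark avoids by keeping data in memory.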

87.11 Analytics Maturity (Gartner)

Tip: Gartner Analytic Continuum
  • Descriptive — what happened?
  • Diagnostic — why?
  • Predictive — what will happen?
  • Prescriptive — what should we do?
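
The four questions can be walked through on one small dataset. The sketch below uses made-up monthly sales and ad-spend figures, and the margin, holding cost and candidate order sizes are hypothetical. Treating a correlation as the "diagnostic" step is a simplification, since correlation alone does not establish a cause.

```python
from math import sqrt
from statistics import mean

months = [1, 2, 3, 4]
sales = [100, 108, 121, 131]     # made-up units sold per month
ad_spend = [10, 11, 13, 14]      # made-up ad spend for the same months

def covariance_sum(xs, ys):
    """Sum of products of deviations from the mean."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Descriptive: what happened?
average_sales = mean(sales)

# Diagnostic: why? Pearson correlation between ad spend and sales.
r = covariance_sum(ad_spend, sales) / sqrt(
    covariance_sum(ad_spend, ad_spend) * covariance_sum(sales, sales)
)

# Predictive: what will happen? Linear trend extrapolated to month 5.
slope = covariance_sum(months, sales) / covariance_sum(months, months)
intercept = average_sales - slope * mean(months)
forecast = intercept + slope * 5

# Prescriptive: what should we do? Pick the order size with the best expected profit.
MARGIN, HOLDING_COST = 5, 2      # hypothetical per-unit margin and unsold-unit cost

def profit(order):
    sold = min(order, forecast)
    unsold = max(0, order - forecast)
    return MARGIN * sold - HOLDING_COST * unsold

best_order = max([130, 140, 150], key=profit)
print(average_sales, round(r, 3), round(forecast, 1), best_order)
```

Each step builds on the previous one: the forecast feeds the order decision, which is why the continuum is usually read as increasing in both difficulty and business value.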

87.12 Indian Context

Tip: India AI and Big Data
  • NITI Aayog National AI Strategy (2018) — #AIforAll.
  • IndiaAI Mission (2024) — INR 10,000+ Cr.
  • Bhashini — language-translation platform.
  • AI4Bharat at IIT Madras.
  • CDAC, IISc — research.
  • Indian AI companies: TCS, Infosys, Wipro, Mindtree, Fractal, Mu Sigma, Mathco, Tiger Analytics, LatentView, Tredence.
  • Startups: Niramai, Haptik, Yellow.ai, Sigmoid.
  • Data centres: Mumbai, Hyderabad, Chennai hubs.
  • DPDP Act 2023 governs personal data.

87.13 Ethics, Risk and Governance

Tip: AI ethics and governance
  • Bias and fairness — algorithmic discrimination.
  • Transparency / Explainability (XAI).
  • Accountability.
  • Privacy.
  • Safety / Alignment.
  • Job displacement.
  • Deepfakes and misinformation.
  • Frameworks: EU AI Act (2024) · NIST AI RMF · IEEE Ethically Aligned Design · OECD AI Principles · UNESCO AI Ethics.
  • India: NITI Aayog Responsible AI; DPDP Act.
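
One simple way to make the bias and fairness item concrete is to compare how often a model gives a favourable decision to different groups. The sketch below computes this gap, often called the demographic parity difference, on made-up decisions for two hypothetical applicant groups. It is only one of several fairness measures, and a gap by itself does not prove discrimination.

```python
def selection_rate(decisions):
    """Share of positive decisions (1 = approved, 0 = rejected)."""
    return sum(decisions) / len(decisions)

# Made-up model decisions for two hypothetical applicant groups.
group_a = [1, 1, 0, 1, 1, 0, 1, 1]
group_b = [1, 0, 0, 1, 0, 0, 1, 0]

rate_a = selection_rate(group_a)   # 6 of 8 approved
rate_b = selection_rate(group_b)   # 3 of 8 approved
parity_gap = rate_a - rate_b       # 0 would mean equal approval rates
print(rate_a, rate_b, parity_gap)
```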

87.15 Practice Questions

Q 01 · McCarthy · Medium

"Artificial Intelligence" was coined at:

  • A. Dartmouth Conference 1956
  • B. Stanford 1985
  • C. MIT 1968
  • D. CMU 1972
Correct Option: A
McCarthy, Minsky, Shannon, Rochester (1956).
Q 02 · Turing · Easy

The Turing Test (1950) is meant to determine:

  • A. Machine's ability to mimic human intelligence
  • B. Machine's processing speed
  • C. Encryption strength
  • D. Hardware reliability
Correct Option: A
Imitation game.
Q 03 · 3 Vs · Medium

The original 3 Vs of Big Data are:

  • A. Volume, Velocity, Variety
  • B. Volume, Value, Velocity
  • C. Veracity, Value, Variety
  • D. Variety, Variance, Volume
Correct Option: A
Doug Laney (Gartner 2001).
Q 04 · Hadoop · Medium

Hadoop's processing framework is:

  • A. MapReduce
  • B. SQL only
  • C. SOAP
  • D. REST
Correct Option: A
MapReduce was described in a 2004 Google paper; Apache Hadoop followed in 2006.
Q 05 · Transformer · Hard

Transformer architecture (2017) is from:

  • A. Vaswani et al. (Google)
  • B. LeCun
  • C. Hinton
  • D. Bengio
Correct Option: A
"Attention Is All You Need" (Vaswani et al., 2017).
Q 06 · ChatGPT · Easy

ChatGPT was launched in:

  • A. November 2022
  • B. 2020
  • C. 2024
  • D. 2018
Correct Option: A
OpenAI, 30 Nov 2022.
Q 07 · Supervised · Medium

Supervised learning uses:

  • A. Labelled data
  • B. Unlabelled data
  • C. Reward signals
  • D. Quantum data
Correct Option: A
Inputs + labels.
Q 08 · RL · Hard

Reinforcement learning is best illustrated by:

  • A. AlphaGo
  • B. Decision tree
  • C. k-Means
  • D. PCA
Correct Option: A
DeepMind's RL system.
Q 09 · Analytics maturity · Medium

"What will happen?" is answered by which type of analytics?

  • A. Descriptive
  • B. Diagnostic
  • C. Predictive
  • D. Prescriptive
Correct Option: C
Predictive: forecasts the future.
Q 10 · India AI · Hard

NITI Aayog's AI strategy is themed:

  • A. #AIforAll
  • B. Make AI India
  • C. AI Bharat
  • D. SmartAI
Correct Option: A
NITI Aayog 2018 strategy.
Q 11 · GAN · Hard

GANs were proposed by:

  • A. Ian Goodfellow
  • B. Hinton
  • C. LeCun
  • D. Bengio
Correct Option: A
Ian Goodfellow, 2014.
Q 12 · Foundation · Hard

The term "Foundation Models" was coined at:

  • A. Stanford 2021
  • B. MIT 2018
  • C. Google 2020
  • D. OpenAI 2022
Correct Option: A
Stanford CRFM, 2021.
Q 13 · EU AI Act · Medium

The EU AI Act was adopted in:

  • A. 2024
  • B. 2021
  • C. 2018
  • D. 2026
Correct Option: A
EU AI Act 2024 (risk-based).
Q 14 · Bhashini · Hard

Bhashini is an Indian platform for:

  • A. Language translation / NLP
  • B. Cloud computing
  • C. Payments
  • D. Tax filing
Correct Option: A
Translation across Indian languages.
Q 15 · RAG · Hard

RAG stands for:

  • A. Retrieval-Augmented Generation
  • B. Random AI Gateway
  • C. Recursive AI Graph
  • D. Refined Algorithmic Grouping
Correct Option: A
RAG = retrieval + LLM generation.

87.15.1 Advanced Format Questions

AR 1 · Assertion-Reason · Hard

A: Big Data is characterised by 3 Vs.
R: Volume, Velocity, Variety (Laney, Gartner 2001).

  • A. Both true; R explains A
  • B. Both true; R does not explain A
  • C. A true, R false
  • D. A false, R true
Correct Option: A
S 1 · Statement-based · Medium

Which of the following are types of machine learning? (i) Supervised. (ii) Unsupervised. (iii) Reinforcement. (iv) Self-supervised.

  • A. All four
  • B. (i) and (ii) only
  • C. (iii) and (iv) only
  • D. (i) only
Correct Option: A
S 2 · Statement-based · Hard

Which of the following are stages of analytics maturity (Gartner)? (i) Descriptive. (ii) Diagnostic. (iii) Predictive. (iv) Prescriptive.

  • A. All four (increasing maturity)
  • B. (i) and (ii) only
  • C. (iii) and (iv) only
  • D. (iv) only
Correct Option: A

87.16 Quick Recall

Important: Quick recall
  • AI: Dartmouth 1956 (McCarthy); Turing 1950 imitation game.
  • AI eras: Symbolic → Winters → Expert Systems → ML → DL (2012 AlexNet) → Generative AI (2022 ChatGPT).
  • AI types: ANI/AGI/ASI; Reactive/Limited memory/ToM/Self-aware.
  • ML: Supervised · Unsupervised · Semi · Self-supervised · RL (AlphaGo) · Transfer · Federated · Active.
  • Algorithms: Regression · DT · RF · XGBoost · k-NN · NB · SVM · k-Means · PCA/tSNE/UMAP · NN · CNN · RNN/LSTM · Transformer (Vaswani 2017) · GAN (Goodfellow 2014) · Diffusion.
  • GenAI: GPT · BERT · DALL-E · ChatGPT Nov 2022 · GPT-4 · Gemini · Claude · Llama · Mistral · Sora; Indian — BharatGPT · Krutrim · Sarvam · AI4Bharat.
  • Mgmt applications: Marketing personalisation · HR · Fraud · Predictive maintenance · SCM · CX · Strategy · Cyber.
  • Big Data (Laney Gartner 2001): 3 Vs (Volume · Velocity · Variety); 5 Vs + Veracity, Value; 7 Vs + Variability, Visualisation.
  • Tech stack: Hadoop (2006) + MapReduce · HDFS · Spark · Hive · Pig · NoSQL (Mongo, Cassandra, HBase) · Kafka · Flink · cloud (Redshift, BigQuery, Synapse, Snowflake, Databricks) · data lake/lakehouse · Delta Lake.
  • Analytics maturity (Gartner): Descriptive · Diagnostic · Predictive · Prescriptive.
  • India: NITI #AIforAll 2018 · IndiaAI Mission 2024 · Bhashini · AI4Bharat · CDAC · IISc; firms — TCS · Infosys · Fractal · Mu Sigma · Tiger Analytics · LatentView.
  • Ethics: bias · XAI · accountability · privacy · safety/alignment · deepfakes; EU AI Act 2024 · NIST AI RMF · OECD · UNESCO; DPDP Act 2023.
  • Modern: GenAI · multimodal · agents · RAG · SLMs · AlphaFold · Edge AI · Nvidia H100/TPU · synthetic data · quantum ML · safety/alignment · sustainable AI.