Elasticsearch is powerful for search and easy to misconfigure. These questions check whether a candidate understands the inverted index, mappings and cluster behaviour.
Hiring an Elasticsearch developer is easy. Telling a real one from a convincing résumé is the hard part, and it’s most of what we do. The questions are grouped by level, because the same question that stretches a junior is a warm-up for a senior.
Junior Elasticsearch interview questions
0–2 years
Core concepts.
What is Elasticsearch used for?
A distributed search and analytics engine for full-text search, log analytics and aggregations at scale.
Thinks it’s just a NoSQL database.
What is an inverted index?
A structure mapping terms to the documents containing them, enabling fast full-text search.
Cannot explain how search is fast.
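A candidate who understands the inverted index can usually sketch one on a whiteboard. The toy version below is a minimal illustration in plain Python, not how Lucene stores its index; the function names and sample documents are made up for the example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, term):
    """Look up one term: a dictionary access, not a scan of every document."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical sample documents.
docs = {1: "quick brown fox", 2: "lazy brown dog", 3: "quick red fox"}
index = build_inverted_index(docs)
print(search(index, "brown"))  # [1, 2]
print(search(index, "quick"))  # [1, 3]
```

The point to listen for: lookup cost depends on the number of matching terms, not on the number of documents scanned.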
What are indices, documents and fields?
An index is a collection of JSON documents; documents have fields; roughly analogous to tables, rows and columns.
Confuses Elasticsearch structure with relational tables.
What is a mapping?
The schema defining field types and how they’re indexed/analysed; wrong mappings break search and aggregations.
Relies on dynamic mapping and gets wrong field types.
What is the difference between a term and a full-text query?
Term queries match exact values; full-text queries analyse text (tokenise, lowercase) for relevance search.
Uses a term query on analysed text and gets no matches.
What is analysis / tokenisation?
Breaking text into tokens (with lowercasing, stemming, etc.) at index and query time so searches match sensibly.
Doesn’t understand why case or punctuation affects results.
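The two red flags above (a term query that finds nothing, and surprise at case or punctuation) come from the same cause. The sketch below uses a deliberately simplified analyser, assumed for illustration only; real analysers are configurable and do more (stemming, stop words and so on).

```python
import re

def analyse(text):
    """Toy analyser: lowercase, then split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

# Tokens stored at index time for one analysed text field.
indexed_tokens = set(analyse("The Quick, Brown FOX!"))

def term_match(value):
    """Term-style lookup: the value is compared as-is, with no analysis."""
    return value in indexed_tokens

def full_text_match(query):
    """Match-style lookup: the query goes through the same analyser first."""
    return any(tok in indexed_tokens for tok in analyse(query))

print(analyse("The Quick, Brown FOX!"))  # ['the', 'quick', 'brown', 'fox']
print(term_match("Quick"))               # False: the index holds 'quick', not 'Quick'
print(full_text_match("Quick"))          # True: the query is lowercased before lookup
```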
What is the difference between keyword and text fields?
text is analysed for full-text search; keyword is exact for filtering, sorting and aggregations.
Aggregates on an analysed text field and gets tokenised buckets.
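A strong junior answer often mentions mapping one field both ways. The sketch below shows a mapping body as a Python dict, assuming hypothetical field names (title, status) and a sub-field called raw; the small helper only lists which paths are exact-value and therefore safe for sorting and aggregations.

```python
# Hypothetical field names; the layout follows the create-index request body.
mapping_body = {
    "mappings": {
        "properties": {
            # Analysed for full-text search, with an exact-value sub-field.
            "title": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            },
            # Exact values only: filtering, sorting, aggregations.
            "status": {"type": "keyword"},
        }
    }
}

def keyword_paths(body):
    """Return field paths mapped as keyword, including multi-field sub-fields."""
    paths = []
    for name, spec in body["mappings"]["properties"].items():
        if spec.get("type") == "keyword":
            paths.append(name)
        for sub, sub_spec in spec.get("fields", {}).items():
            if sub_spec.get("type") == "keyword":
                paths.append(f"{name}.{sub}")
    return paths

print(keyword_paths(mapping_body))  # ['title.raw', 'status']
```

With a mapping like this, search runs against title and aggregations run against title.raw.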
How do you retrieve and search documents?
The query DSL for search and filters, plus get-by-id; results include relevance scores for full-text queries.
Fetches everything and filters in the app.
Mid-level Elasticsearch interview questions
2–5 years
Relevance and aggregations.
How does relevance scoring work?
A scoring model (BM25 by default) ranks documents by term frequency, how rare the term is across the index (inverse document frequency) and field length; you can tune it.
Assumes results come back in insertion order.
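For interviewers who want to probe the scoring answer, here is a minimal sketch of the textbook BM25 formula for a single query term. It assumes whitespace tokenisation, k1 = 1.2 and b = 0.75, and hypothetical sample documents; absolute scores from a real cluster will differ, so only the ordering is shown.

```python
import math

K1, B = 1.2, 0.75  # commonly cited BM25 parameter defaults (assumed here)

def bm25_scores(docs, term):
    """Score every document for one term using the textbook BM25 formula."""
    tokenised = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenised)
    avg_len = sum(len(tokens) for tokens in tokenised.values()) / n_docs
    containing = sum(1 for tokens in tokenised.values() if term in tokens)
    # Rarer terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - containing + 0.5) / (containing + 0.5))
    scores = {}
    for doc_id, tokens in tokenised.items():
        tf = tokens.count(term)
        # Longer-than-average fields are penalised through the length norm.
        norm = K1 * (1 - B + B * len(tokens) / avg_len)
        scores[doc_id] = idf * tf * (K1 + 1) / (tf + norm)
    return scores

# Hypothetical documents: same term frequency, very different lengths.
docs = {
    "short": "search engine",
    "long": "a very long document that mentions search once among many other words",
    "unrelated": "cooking recipes",
}
scores = bm25_scores(docs, "search")
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['short', 'long', 'unrelated']
```

Both matching documents contain the term once; the shorter one ranks higher because of field-length normalisation.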
What is the difference between a query and a filter context?
Query context scores relevance; filter context is a yes/no match that’s cacheable and faster — use filters for exact criteria.
Puts exact filters in query context and loses caching.
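The split between the two contexts is easiest to see in a bool query. The sketch below builds a search body as a Python dict; the field names (title, status, published_at) and the helper name are assumptions for illustration.

```python
def build_search(text, status, since):
    """Scored full-text clause in must; exact criteria in filter (unscored)."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [{"match": {"title": text}}],
                # Filter context: yes/no match, no scoring.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"published_at": {"gte": since}}},
                ],
            }
        }
    }

body = build_search("shard sizing", "published", "2024-01-01")
print(len(body["query"]["bool"]["filter"]))  # 2
```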
What are aggregations?
A framework for analytics over search results (metrics, buckets), enabling dashboards and faceting.
Pulls data out and aggregates in application code.
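A candidate who has built dashboards should be able to write a bucket aggregation with a nested metric from memory. A minimal sketch of such a request body follows, assuming hypothetical field names (status as a keyword field, price as a numeric field) and a made-up helper name.

```python
def terms_with_avg(bucket_field, metric_field, top_n=10):
    """Bucket documents by one field and compute an average inside each bucket."""
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            "by_bucket": {
                "terms": {"field": bucket_field, "size": top_n},
                "aggs": {
                    "avg_metric": {"avg": {"field": metric_field}},
                },
            }
        },
    }

agg_body = terms_with_avg("status", "price")
print(agg_body["aggs"]["by_bucket"]["terms"])  # {'field': 'status', 'size': 10}
```

The cluster does the grouping and averaging, so only the bucket summaries cross the network.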
What are shards and replicas?
A shard is a subset of an index enabling horizontal scale; replicas are copies for availability and read throughput.
Creates a single huge shard or hundreds of tiny ones.
How do you design mappings for good search?
Choose field types deliberately, use analysers/multi-fields (text + keyword), and avoid mapping explosion.
Lets dynamic mapping create thousands of fields.
How does the bulk API help?
Batching many index/update operations in one request for far higher indexing throughput.
Indexes documents one request at a time.
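A useful follow-up is to ask what a bulk request body looks like. It is newline-delimited JSON: an action line followed by the document source, repeated. The sketch below only builds that body as a string; the index name, documents and helper name are hypothetical, and sending it (with the appropriate NDJSON content type) is left out.

```python
import json

def bulk_body(index_name, docs):
    """Build an NDJSON bulk body: one action line, then one source line, per doc."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

docs = [("1", {"title": "a"}), ("2", {"title": "b"})]
body = bulk_body("articles", docs)
print(body.count("\n"))  # 4
```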
How do you handle updates and versioning?
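An update writes a new version of the document and marks the old one as deleted, so documents are effectively reindexed; optimistic concurrency control (sequence number and primary term checks) prevents lost updates.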
Assumes in-place partial updates are free.
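The lost-update problem is easier to discuss with a concrete model. The sketch below is an in-memory toy, not Elasticsearch client code: the class and exception names are invented, and it tracks only a sequence number, whereas a real conditional write also checks the primary term.

```python
class VersionConflict(Exception):
    """Raised when a conditional write is based on a stale read."""

class ToyStore:
    def __init__(self):
        self._docs = {}      # doc_id -> (seq_no, source)
        self._next_seq = 0

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, source, if_seq_no=None):
        """Write the whole document; reject if the caller's seq_no is stale."""
        current = self._docs.get(doc_id)
        if if_seq_no is not None and (current is None or current[0] != if_seq_no):
            raise VersionConflict(f"expected seq_no {if_seq_no}")
        seq_no = self._next_seq
        self._next_seq += 1
        self._docs[doc_id] = (seq_no, dict(source))
        return seq_no

store = ToyStore()
store.index("1", {"stock": 5})

# Two writers read the same version, then both try to write.
seq_a, _ = store.get("1")
seq_b, _ = store.get("1")
store.index("1", {"stock": 4}, if_seq_no=seq_a)  # first writer succeeds
try:
    store.index("1", {"stock": 3}, if_seq_no=seq_b)
    conflict = False
except VersionConflict:
    conflict = True  # second writer must re-read and retry

print(conflict)  # True
```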
What causes slow queries and how do you find them?
Expensive aggregations, wildcard/leading-wildcard queries, large result sets and poor mappings; the profile API and slow logs help.
Uses leading wildcards on huge indices.
Senior Elasticsearch interview questions
5+ years
Cluster and operations.
How do you size and design shards?
Balance shard count and size to the data and query load; too many small shards waste resources on per-shard overhead, too few limit parallelism.
Picks shard count arbitrarily with no rationale.
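A senior candidate should be able to show the arithmetic behind a shard count. The sketch below is one possible back-of-envelope calculation; the 30 GB target shard size is an assumption for the example, not a rule, and the right figure depends on the data and query load.

```python
import math

def primary_shard_count(total_gb, target_shard_gb=30):
    """Primaries needed so that no shard exceeds the target size (at least one)."""
    return max(1, math.ceil(total_gb / target_shard_gb))

def total_shard_count(primaries, replicas):
    """Every primary has `replicas` copies, and all of them occupy cluster resources."""
    return primaries * (1 + replicas)

primaries = primary_shard_count(600)      # hypothetical 600 GB index
print(primaries)                          # 20
print(total_shard_count(primaries, 1))    # 40
```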
How does the cluster maintain availability?
Primary and replica shards distributed across nodes, with automatic reallocation and a master managing cluster state.
Runs a single node in production.
How do you manage time-series/log data at scale?
Time-based indices with index lifecycle management to roll over, shrink and delete old data cost-effectively.
One giant ever-growing index.
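Asking a candidate to outline a lifecycle policy separates people who have run log clusters from people who have read about them. The sketch below shows one plausible policy body as a Python dict; every threshold (50 GB, 7 days, 90 days) is an assumed value for illustration.

```python
# Assumed thresholds; tune them to retention and cost requirements.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Start a new backing index when either limit is reached.
                    "rollover": {"max_primary_shard_size": "50gb", "max_age": "7d"}
                }
            },
            "warm": {
                "min_age": "7d",
                # Fewer shards for data that is no longer written to.
                "actions": {"shrink": {"number_of_shards": 1}},
            },
            "delete": {
                "min_age": "90d",
                "actions": {"delete": {}},
            },
        }
    }
}

print(list(ilm_policy["policy"]["phases"]))  # ['hot', 'warm', 'delete']
```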
How do you keep an Elasticsearch cluster healthy?
Monitor heap/GC, shard counts, and disk watermarks; avoid oversharding and mapping explosion; plan capacity.
Ignores JVM heap and disk watermarks until the cluster fails.
When is Elasticsearch the wrong tool?
As a primary transactional datastore or for strong consistency; it’s near-real-time and eventually consistent, best alongside a source of truth.
Uses it as the system of record for critical data.
How do you reindex without downtime?
Reindex into a new index and swap via aliases so clients switch atomically.
Deletes and rebuilds an index in place, causing an outage.
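The alias swap is the part candidates most often describe vaguely. A minimal sketch of the request body for the aliases endpoint follows; the alias and index names and the helper name are hypothetical. Both actions go in one request so that clients never see the alias pointing at nothing.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build one aliases request that moves an alias from the old index to the new."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

swap = alias_swap_actions("articles", "articles_v1", "articles_v2")
print([next(iter(action)) for action in swap["actions"]])  # ['remove', 'add']
```

Clients keep querying the alias name throughout; only the index behind it changes.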
How do you tune indexing vs search performance?
Adjust refresh interval and replicas during bulk loads, and design mappings/queries for the read pattern.
Leaves defaults and wonders why bulk indexing is slow.
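A good answer names the settings and remembers to restore them. The sketch below shows two index-settings bodies as Python dicts; the restored values (a 1s refresh interval and one replica) are assumptions and should match whatever the index normally runs with.

```python
# Applied before a large bulk load: no periodic refresh, no replica copies to write.
bulk_load_settings = {
    "index": {
        "refresh_interval": "-1",
        "number_of_replicas": 0,
    }
}

# Applied once the load finishes (assumed normal values for this index).
restored_settings = {
    "index": {
        "refresh_interval": "1s",
        "number_of_replicas": 1,
    }
}

def changed_keys(before, after):
    """List the index settings that differ between two settings bodies."""
    return sorted(k for k in before["index"] if before["index"][k] != after["index"].get(k))

print(changed_keys(bulk_load_settings, restored_settings))
# ['number_of_replicas', 'refresh_interval']
```

Running without replicas during the load trades durability for speed, so the source data should remain available until replicas are back.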
How do you secure and operate a cluster in production?
Authentication and TLS, role-based access, snapshots for backup, and never exposing it directly to the internet.
Leaves the cluster open to the internet unauthenticated.