This is a list of words
DEMYSTIFIED

Y U NO
SEARCH ENGINE
software that builds indexes on documents
and answers queries using those indexes
A document is a collection of fields
full text
tag
numeric
geo
maps keywords to docs
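A minimal sketch of a structure that maps keywords to docs, assuming documents are plain strings keyed by an integer ID. The names `build_inverted_index` and `docs` are illustrative, not taken from any particular engine.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each keyword to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenization for illustration: lowercase, split on whitespace
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

docs = {1: "hello world", 2: "hello foo", 3: "world of foo"}
index = build_inverted_index(docs)
print(index["hello"])  # [1, 2]
print(index["foo"])    # [2, 3]
```

Answering a single-keyword query is then one dictionary lookup rather than a scan over every document.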
Whitespace
list of words → list | of | words
Punctuation
foo-bar.baz…bag → [foo, bar, baz, bag]
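The two splitting rules above can be sketched with one regular expression. This assumes a token is any run of letters or digits; real tokenizers usually have more rules (escaping, language-specific separators), and `tokenize` is a hypothetical name.

```python
import re

def tokenize(text):
    """Split on whitespace and punctuation, keeping runs of letters and digits."""
    # [^\W_]+ matches one or more word characters, excluding the underscore
    return re.findall(r"[^\W_]+", text)

print(tokenize("list of words"))     # ['list', 'of', 'words']
print(tokenize("foo-bar.baz…bag"))   # ['foo', 'bar', 'baz', 'bag']
```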
Extremely common words
a | is | the | an | and | are |
as | at | be | but | by | for |
if | in | into | it | no | not |
of | on | or | such | that | their |
then | there | these | they | this | to |
was | will | with |
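A small sketch of stop-word filtering using exactly the word list above. The case-insensitive comparison is an assumption, and `remove_stop_words` is an illustrative name.

```python
STOP_WORDS = frozenset(
    "a is the an and are as at be but by for "
    "if in into it no not of on or such that their "
    "then there these they this to was will with".split()
)

def remove_stop_words(tokens):
    """Drop extremely common words; comparison ignores case."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words("This is a list of words".split()))  # ['list', 'words']
```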
This is a list of words↓
list words
Reduce a word to its simplest form
running↓
run
am, are, is → be
abode, abided, abidden → abide
cat, cats, cat’s, cats' → cat
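A toy normalizer that reproduces the examples above. It is not a real stemming algorithm: the irregular-form table and the three suffix rules are assumptions chosen only to cover these cases, and production engines typically rely on a stemming library instead.

```python
# Irregular forms need a lookup table; suffix rules cannot derive them
IRREGULAR = {
    "am": "be", "are": "be", "is": "be",
    "abode": "abide", "abided": "abide", "abidden": "abide",
}

def normalize(word):
    """Reduce a word to a simpler form (toy rules, illustration only)."""
    w = word.lower().replace("’", "'")
    if w in IRREGULAR:
        return IRREGULAR[w]
    w = w.rstrip("'")                      # cats' -> cats
    if w.endswith("'s"):
        w = w[:-2]                         # cat's -> cat
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]                         # running -> runn
        if len(w) > 1 and w[-1] == w[-2]:
            w = w[:-1]                     # runn -> run
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        w = w[:-1]                         # cats -> cat
    return w

print(normalize("running"))                                   # run
print([normalize(w) for w in ["cat", "cats", "cat’s", "cats'"]])
print(normalize("are"), normalize("abidden"))                 # be abide
```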
| | | Fr | Spa | Por | Ita |
|---|---|---|---|---|---|
| noun | ANCE | ance | anza | eza | anza |
| adjective | IC | ique | ico | ico | ico |
| noun | ATION | ation | ación | ação | azione |
| adjective | ABLE | able | able | ável | abile |
{boy, child, baby}
{girl, child, baby}
{man, person, adult}
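One way such groups could be used is query expansion: a term is replaced by every word that shares a group with it. This is a sketch under that assumption. How a given engine actually applies synonym groups may differ, and `expand` is a hypothetical helper.

```python
SYNONYM_GROUPS = [
    {"boy", "child", "baby"},
    {"girl", "child", "baby"},
    {"man", "person", "adult"},
]

def expand(term):
    """Return the term plus every word sharing a synonym group with it."""
    expanded = {term}
    for group in SYNONYM_GROUPS:
        if term in group:
            expanded |= group
    return expanded

print(sorted(expand("boy")))    # ['baby', 'boy', 'child']
print(sorted(expand("child")))  # ['baby', 'boy', 'child', 'girl']
print(sorted(expand("car")))    # ['car']
```

Note that "boy" does not expand to "girl": the two words share synonyms but are not in a common group.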
Similar to full-text fields but more compact
Multi-word phrases: foo bar baz
Exact phrases: "hello world"
Prefix: hel*
Or (union): hello|hallo|shalom|hola
Negation: hello -world
Specific fields: @field:hello world
Numeric range: @field:[1 10]
Geo-radius: @field:[-77 39 5 km]
Tags: @field:{tag1 | tag2}
Optional: ~bar
based on chained iterators
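A rough sketch of what chained iterators could look like, using Python generators. The function names `read`, `intersect`, and `exact_intersect` follow the query plans on these slides. The index layout (term → doc ID → word positions) and the helper `_aligned` are assumptions made for illustration.

```python
# Hypothetical positional index: term -> {doc_id: [word positions]}
# doc 1: "hello world foo", doc 2: "world hello", doc 3: "hello foo world"
INDEX = {
    "hello": {1: [0], 2: [1], 3: [0]},
    "world": {1: [1], 2: [0], 3: [2]},
    "foo":   {1: [2], 3: [1]},
}

def read(term):
    """Leaf iterator: yield (doc_id, positions) in ascending doc ID order."""
    for doc_id, positions in sorted(INDEX.get(term, {}).items()):
        yield doc_id, positions

def _aligned(iterators):
    """Yield (doc_id, [positions per child]) for docs present in every child."""
    iterators = [iter(it) for it in iterators]
    try:
        current = [next(it) for it in iterators]
        while True:
            target = max(doc_id for doc_id, _ in current)
            # Advance every child that is behind the largest current doc ID
            for i, it in enumerate(iterators):
                while current[i][0] < target:
                    current[i] = next(it)
            if all(doc_id == target for doc_id, _ in current):
                yield target, [positions for _, positions in current]
                current = [next(it) for it in iterators]
    except StopIteration:
        return  # one child is exhausted, so no further doc can match

def intersect(*iterators):
    """AND: docs that appear in every child iterator."""
    for doc_id, per_child in _aligned(iterators):
        yield doc_id, sorted(set().union(*per_child))

def exact_intersect(*iterators):
    """Phrase: docs where the children match at consecutive positions."""
    for doc_id, per_child in _aligned(iterators):
        ends = set(per_child[0])
        for positions in per_child[1:]:
            ends = {p for p in positions if p - 1 in ends}
        if ends:
            yield doc_id, sorted(ends)

def doc_ids(iterator):
    return [doc_id for doc_id, _ in iterator]

print(doc_ids(read("hello")))                                   # [1, 2, 3]
print(doc_ids(intersect(read("hello"), read("world"))))         # [1, 2, 3]
print(doc_ids(exact_intersect(read("hello"), read("world"))))   # [1]
print(doc_ids(intersect(
    exact_intersect(read("hello"), read("world")),
    read("foo"),
)))                                                             # [1]
```

Because every iterator yields doc IDs in ascending order, each combinator only needs to look at the current head of its children, so nothing is materialized until the caller consumes results.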
hello↓
read("hello")
hello world↓
intersect(
read("hello"),
read("world")
)
"hello world"↓
exact_intersect(
read("hello"),
read("world")
)
"hello world" foo↓
intersect(
exact_intersect(
read("hello"),
read("world")
),
read("foo")
)
%%Hamberders%%↓
Hamburgers
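Fuzzy matching of this kind is commonly based on Levenshtein (edit) distance. The sketch below assumes that the two `%` pairs mean "at most two edits" and that candidates come from a small in-memory vocabulary. Both are assumptions for illustration, and real engines use more efficient structures than a linear scan.

```python
def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]

def fuzzy_match(query, vocabulary, max_distance):
    """Return every vocabulary term within max_distance edits of the query."""
    return [t for t in vocabulary if levenshtein(query, t) <= max_distance]

vocabulary = ["Hamburgers", "Hamsters", "Cheeseburgers"]
print(levenshtein("Hamberders", "Hamburgers"))                # 2
print(fuzzy_match("Hamberders", vocabulary, max_distance=2))  # ['Hamburgers']
```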
AI → I
HEOP → help
D → the
ERF → earth
primarily designed for American English names
also encodes most English words well
double encoding for a given word
likely pronunciation
optional alternative pronunciation
John → JN
Jon → JN
Jawn → JN
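To show how phonetic keys are used at index time, here is a deliberately crude stand-in. `crude_key` is NOT Double Metaphone: it produces a single key by keeping the first letter and dropping vowels plus H, W, and Y, which happens to reproduce the three examples above. The real algorithm has many more rules and can return a second, alternative key.

```python
from collections import defaultdict

def crude_key(name):
    """Toy phonetic key (not Double Metaphone): first letter + remaining consonants."""
    name = name.upper()
    key = name[:1]
    for ch in name[1:]:
        if ch in "AEIOUHWY":
            continue          # drop vowels and weak consonants
        if ch != key[-1]:
            key += ch         # collapse repeated letters
    return key

# Index names under their key so that sound-alike spellings share a bucket
phonetic_index = defaultdict(list)
for name in ["John", "Jon", "Jawn", "Mary"]:
    phonetic_index[crude_key(name)].append(name)

print(phonetic_index["JN"])  # ['John', 'Jon', 'Jawn']
```

A query for "Jon" is encoded the same way and looked up by key, so all three spellings match.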
index split across many partitions by document ID
a partition has complete index of all its documents
query partitions concurrently and merge results
… need search coordinator
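A minimal sketch of the partition-and-merge idea, assuming partitions are in-process objects, documents are assigned by `doc_id % NUM_PARTITIONS`, and relevance is plain term frequency. The `Partition` and `Coordinator` classes are hypothetical; a real deployment would query remote shards over the network.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

NUM_PARTITIONS = 3

class Partition:
    """Holds the complete index for the documents assigned to it."""
    def __init__(self):
        self.index = {}  # term -> {doc_id: term frequency}

    def add(self, doc_id, text):
        for term in text.lower().split():
            postings = self.index.setdefault(term, {})
            postings[doc_id] = postings.get(doc_id, 0) + 1

    def search(self, term, limit):
        postings = self.index.get(term, {})
        return heapq.nlargest(limit, ((score, d) for d, score in postings.items()))

class Coordinator:
    """Routes documents by ID and merges per-partition results."""
    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def add(self, doc_id, text):
        self.partitions[doc_id % len(self.partitions)].add(doc_id, text)

    def search(self, term, limit=10):
        # Query every partition concurrently, then keep the global top results
        with ThreadPoolExecutor(max_workers=len(self.partitions)) as pool:
            partials = pool.map(lambda p: p.search(term, limit), self.partitions)
            candidates = [hit for partial in partials for hit in partial]
        return heapq.nlargest(limit, candidates)

coordinator = Coordinator()
for doc_id, text in {1: "hello world", 2: "hello hello hello",
                     3: "world", 4: "hello hello", 5: "foo"}.items():
    coordinator.add(doc_id, text)

print(coordinator.search("hello", limit=2))  # [(3, 2), (2, 4)] as (score, doc_id)
```

Each partition returns its own top `limit` hits, so the coordinator merges at most `limit × partitions` candidates regardless of index size.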


…ALWAYS