
Using Your Vizion Elastic API With cURL

Elasticsearch operates as a REST API and can be interfaced with any number of tools that make HTTP requests. For this guide, we will use cURL, available in the Mac or Linux console, to illustrate some basic commands. For an even easier way to practice, you can open Kibana and use 'Dev Tools', which can be found on the left navigation bar.

Our API calls will generally have two parts:

  1. the url/endpoint, which includes the 'ES url' given to you when you created your Vizion Elastic instance as well as an endpoint that tells Elasticsearch what kind of operation you're performing.
  2. the body, which uses Query DSL (Domain Specific Language). The Query DSL closely follows JSON format and allows for a huge range of control over your queries and other operations.
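
The two parts described above can also be put together from any HTTP client, not only cURL. The following is a minimal Python sketch using only the standard library. The `ES_URL` value and the `build_request` helper are hypothetical names chosen for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the ES URL of your own instance.
ES_URL = "https://example.invalid:9200"

def build_request(method, endpoint, body=None):
    """Combine the url/endpoint and the optional Query DSL body into one request."""
    data = None
    headers = {}
    if body is not None:
        # The Query DSL body travels as JSON text.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = ES_URL.rstrip("/") + "/" + endpoint.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)

req = build_request("GET", "account/_search",
                    {"query": {"match": {"firstname": "tyler"}}})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it, assuming the instance is reachable and accepts your credentials.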

Creating an Index

The index is one of the most fundamental concepts of Elasticsearch. It defines a related set of data, and can be compared to a 'database' in relational databases. For example, you might create an index called 'student' for storing a dataset of student records. The majority of our searches will take place within the context of a specific index.

curl -XPUT '< Your Elasticsearch URL >/< Index name >'

You may notice that this example contains only the url/endpoint part of the request and not the Query DSL body, since this is a simple action that doesn't require much specification.

Note: You may get an error that reads something like 'no alternative certificate subject name matches target host name'. You can include a -k flag after -XPUT to make cURL skip SSL certificate verification.

Creating a Document

A document is the Elasticsearch equivalent of a 'row' in relational databases. It is represented as a JSON object of fields and values. ES automatically supports a wide array of datatypes, including strings, numbers, booleans, datetimes, arrays, and nested objects. Documents need no predefined schema and can accept whatever fields are assigned.

Here we will create a document in an index called 'person', an index which did not previously exist. No problem, Elasticsearch automatically creates it!

curl -XPUT '< Your Elasticsearch URL >/person/_doc/1' -H 'Content-Type: application/json' -d '{
"name": "Josephina Echerson",
"nickNames": ["Josie Jo", "Raspberry Jo"],
"bornAt": 601454129
}'

Create many documents with a JSON file

For the rest of this tutorial, it will be useful to have a decently-sized dataset to illustrate search functionality. You can follow along by first downloading this JSON file, which contains fake banking records. With a single cURL operation, we can enter the contents of this file.

curl -XPOST '< Your Elasticsearch URL >/account/_bulk' -H 'Content-Type: application/json' --data-binary '@< Your file name >.json'

Note that through this action, we are creating an index called 'account' and entering hundreds of documents into it.
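
Loading many documents in one request is normally done through Elasticsearch's '_bulk' endpoint, which expects newline-delimited JSON: one action line followed by one document line, repeated, with a trailing newline. The sketch below shows that layout. It assumes the downloaded file follows this format, which the text above does not confirm, and the two sample records and the `to_bulk_ndjson` helper are made up for illustration.

```python
import json

def to_bulk_ndjson(docs):
    """Render (doc_id, document) pairs as a newline-delimited _bulk body."""
    lines = []
    for doc_id, doc in docs:
        # Action line: index the document that follows under this id.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

# Made-up records, only to show the shape of the payload.
sample = [
    ("1", {"firstname": "Example", "lastname": "Person", "balance": 100}),
    ("2", {"firstname": "Another", "lastname": "Person", "balance": 200}),
]
payload = to_bulk_ndjson(sample)
print(payload, end="")
```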

Query vs. Filter

As we've discussed, Elasticsearch accepts queries in the form of Query DSL, formatted as JSON, which allows for nested layers of specifications. Within this Query DSL body, each clause runs either as a query or as a filter (for example, clauses placed under 'filter' in a bool query run as filters). The difference between the two is that a query calculates a score for each document based on its relevance to the search criteria, then returns a ranked list. A filter merely filters out all documents that don't meet certain criteria. This is a yes or no proposition, needs no ranking, and thus needs less computation.
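
To make the distinction concrete, here is a small Python sketch that builds one body of each kind. The helper names are hypothetical. The filter version uses the 'filter' clause of a bool query, which is one common way to run a clause without scoring.

```python
import json

def scored_search(field, text):
    """Query context: matching documents are ranked by relevance score."""
    return {"query": {"match": {field: text}}}

def filtered_search(field, low, high):
    """Filter context: each document either passes the range check or does not."""
    return {
        "query": {
            "bool": {
                "filter": [{"range": {field: {"gte": low, "lte": high}}}]
            }
        }
    }

print(json.dumps(scored_search("address", "street"), indent=2))
print(json.dumps(filtered_search("age", 40, 45), indent=2))
```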

The Match Query

Let's check out the most basic query, the match query. The query looks for the provided search term ('tyler') within the field specified ('firstname'). Notice this query uses the '_search' endpoint. We also specify that this search is taking place within the 'account' index.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match": {
"firstname": "tyler"
}
}
}'

Match Phrase

The match query searches for individual words within a text field without regard to order. Multiple search terms may be included, but will be searched for individually, and a document may return a high score even if not all words are found. When searching for multiple words in a specific order, a match_phrase query should be used.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"match_phrase": {
"address" : "George Street"
}
}
}'

This example may not be particularly illuminating, since the dataset does not include text fields with sentence-like text. But you should note that this query would return a document containing 'George Street' in the address, but not one containing 'St. George Place', since 'match_phrase' looks for the ENTIRE provided phrase.

Range Queries

Elasticsearch allows you to search for documents where number and date fields fall within specified ranges. For this, use a range query: specify the field you want to search, then pass an object containing the constraints. In this example, we will search for documents where the 'age' field has a value between 40 and 45 (inclusive).

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"range": {
"age" : {
"gte": 40,
"lte": 45
}
}
}
}'

As you see, we used an object to specify the range to search for within the 'age' field. We used 'gte' to specify 'greater than or equal to' and 'lte' for 'less than or equal to', but could have also used 'gt' or 'lt' for exclusive ranges.
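
The difference between the inclusive and exclusive bounds can be checked locally. This Python sketch only mimics the meaning of the four operators on a made-up list of ages. It is an illustration of the semantics, not how Elasticsearch evaluates the query internally, and `in_range` is a hypothetical helper.

```python
import operator

# Map each range keyword to the comparison it stands for.
OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

def in_range(value, constraints):
    """Return True when the value satisfies every constraint in the object."""
    return all(OPS[name](value, bound) for name, bound in constraints.items())

ages = [39, 40, 42, 45, 46]  # made-up sample values
inclusive = [a for a in ages if in_range(a, {"gte": 40, "lte": 45})]
exclusive = [a for a in ages if in_range(a, {"gt": 40, "lt": 45})]
print(inclusive)  # [40, 42, 45]
print(exclusive)  # [42]
```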

Wildcard/Regex Queries

Elasticsearch has built-in capabilities for wildcard searches. Simply specify that you are doing a wildcard search and use a '?' to represent any one character or '*' to represent any number of characters. For example, 'the?' would match 'they', 'them', 'then', etc. In the example below, we will search the 'account' index for anyone who has a 'lastname' that begins with 'Mc'.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"wildcard": {
"lastname" : "Mc*"
}
}
}'
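
The meaning of '?' and '*' can be tried out locally before running a search. The sketch below uses Python's `fnmatch.fnmatchcase`, whose '?' and '*' behave the same way for simple patterns like these. It is only an approximation for illustration and says nothing about how Elasticsearch analyzes or stores the field.

```python
from fnmatch import fnmatchcase

# '?' stands for exactly one character.
print(fnmatchcase("they", "the?"))      # True
print(fnmatchcase("the", "the?"))       # False: no character left for '?'

# '*' stands for any number of characters, including none.
print(fnmatchcase("McDonald", "Mc*"))   # True
print(fnmatchcase("Mc", "Mc*"))         # True
print(fnmatchcase("Smith", "Mc*"))      # False
```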

You have even more capabilities by using the Regexp search. Here we will search the 'email' field only for those that have 'gmail.com' accounts.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"regexp": {
"email" : "@gmail.com"
}
}
}'

Note: Wildcard and regex searches can require a lot of computation. Be careful not to make them too general (such as a '*' wildcard after only a couple of letters) in large datasets, or else your searches can become very slow.

Compound Queries

Elasticsearch supports compound queries, which allow for a higher level of flexibility and complexity within your searches. The bool query allows you to search for documents that satisfy a combination of requirements. For example, the following will search for accounts from the state of Kentucky with a balance between 20,000 and 30,000, excluding any whose employer is 'Ezentia'.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"query": {
"bool": {
"must": [
{
"match": {
"state": "KY"
}
},
{
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
],
"must_not": {
"match": {
"employer": "Ezentia"
}
}
}
}
}'

Notice the use of 'must' and 'must_not' clauses above. They work as you might expect and only return results that satisfy the criteria nested within. Also available is 'should', which boosts the scores of documents that meet the criteria but does not explicitly require them. Also note that within any type of bool clause, you may pass in a single object (as in the case of 'must_not' above) or an array (as in the case of 'must' above).
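
Bool bodies get deeply nested, so it can help to assemble them in code. The following Python sketch builds the Kentucky search described in this section. `bool_query` is a hypothetical helper, and it simply passes each clause through, whether it is given as a single object or as an array.

```python
import json

def bool_query(must=None, must_not=None, should=None):
    """Assemble a bool query from whichever clauses are provided."""
    clauses = {}
    for name, value in (("must", must), ("must_not", must_not), ("should", should)):
        if value is not None:
            # A clause may be one query object or a list of them.
            clauses[name] = value
    return {"query": {"bool": clauses}}

body = bool_query(
    must=[
        {"match": {"state": "KY"}},
        {"range": {"balance": {"gte": 20000, "lte": 30000}}},
    ],
    must_not={"match": {"employer": "Ezentia"}},
)
print(json.dumps(body, indent=2))
```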

Source Filtering

You may have noticed that the results of your queries contain a '_source' object, which contains the data put into that document at creation/update. When searching, you can specify a set of fields for this object to include by passing in a '_source' clause with an array of the desired fields. Here is a query that will return just the first and last names of all accounts with a balance above 40000.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"_source": ["firstname", "lastname"],
"query": {
"range": {
"balance": {
"gt": 40000
}
}
}
}'

Pagination

When expecting large numbers of results, you will probably want to use pagination to work with those results in manageable chunks. Using the 'size' clause in your search will specify how many results to return. Note that 'size' **defaults to 10, meaning you will see only the top 10 results by default.** The 'from' clause specifies where to start returning results, with the first result numbered 0 (the default if you don't include a 'from').

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"size": 20,
"from": 20,
"query": {
"match": {
"state": "CA"
}
}
}'

Above is an example of a search for accounts from California. The 'size' clause specifies that a maximum of 20 documents will be returned, and the 'from' clause starts the results at the beginning of the second "page" (matches 0-19 making up the first page).
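
The arithmetic behind 'from' is simple but easy to get off by one. This Python sketch converts a 1-based page number into the 'from' and 'size' values. `page_params` is a hypothetical helper, and the default of 10 mirrors the default 'size' mentioned above.

```python
def page_params(page, size=10):
    """Translate a 1-based page number into 'from' and 'size' values."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    # Results are numbered from 0, so page 1 starts at offset 0.
    return {"from": (page - 1) * size, "size": size}

print(page_params(1, 20))  # {'from': 0, 'size': 20}
print(page_params(2, 20))  # {'from': 20, 'size': 20}
```

The returned values can be merged into the top level of a search body next to 'query'.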

Sorting

With the sort clause, you can specify a field to sort by and whether to sort it ascending or descending. Here we will search for accounts from the state of Oregon, with the results to be sorted by balance in descending order.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"sort": [
{ "balance": "desc" }
],
"query": {
"match": {
"state": "OR"
}
}
}'

Aggregations

Elasticsearch has some useful tools for aggregating numeric fields. An 'aggs' clause can return the 'min', 'max', 'avg', 'sum', and 'value_count' of a numeric field across a set of documents. Below we will find the lowest, highest, average, and total account balances.

curl -XGET "< ES URL >/account/_search" -H 'Content-Type: application/json' -d '{
"size": 0,
"aggs": {
"low_balance": {
"min": {
"field": "balance"
}
},
"high_balance": {
"max": {
"field": "balance"
}
},
"average_balance": {
"avg": {
"field": "balance"
}
},
"balance_total": {
"sum": {
"field": "balance"
}
}
}
}'

Note that the names 'low_balance', 'high_balance', etc. are arbitrary labels that we have assigned to each aggregation. This is how each aggregation will be labeled in the results. Note also that we have set 'size' to 0. This way the search doesn't return any documents themselves, only the aggregations.
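
Once the response comes back, each label appears under an 'aggregations' key with its result in a 'value' field. The Python sketch below pulls those values out. The sample response is hand-written with made-up numbers purely to show the shape, and `metric_values` is a hypothetical helper.

```python
def metric_values(response):
    """Map each aggregation label to its single numeric result."""
    return {
        name: result["value"]
        for name, result in response["aggregations"].items()
    }

# Hand-written sample with made-up numbers, not real output.
sample_response = {
    "hits": {"hits": []},  # empty because 'size' was set to 0
    "aggregations": {
        "low_balance": {"value": 1000.0},
        "high_balance": {"value": 5000.0},
        "average_balance": {"value": 3000.0},
        "balance_total": {"value": 9000.0},
    },
}

values = metric_values(sample_response)
for label, value in values.items():
    print(label, value)
```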