Amazon OpenSearch Service – A Comprehensive Guide
Like many organizations, yours may be facing challenges when seeking secure, scalable, and high-performing search and analytics solutions. Indeed, the pursuit of making the most of the available data is often hindered by operational complexities. Whether the obstacle is data security or scalability, this is where Amazon OpenSearch Service comes into play.
Whether you are new to Amazon OpenSearch Service or already a confident user, this article will guide you through the intricacies of search and analytics with the Service. It gives you a thorough overview of:
- What OpenSearch Service is and how it relates to OpenSearch
- How Amazon OpenSearch Service operates and what functions it covers
- What the typical use cases for OpenSearch Service are
- What benefits OpenSearch Service provides and at what cost
What is Amazon OpenSearch Service?
To ensure seamless business operations, IT teams have to process an unprecedented volume of information, including log data, security data, and application performance data. Information plays an equally significant role in fueling search applications as well. While Amazon OpenSearch Service stands out as a robust and widely used solution for handling these demands, many organizations have untapped potential to extract even greater value from it.
Some ways of making practical use of OpenSearch Service are described below.
Simplifying AWS cloud operations
Amazon OpenSearch Service seamlessly integrates with AWS services and offers the flexibility to choose between open-source engines like OpenSearch and ALv2 Elasticsearch. Embracing OpenSearch Service eliminates the complexities associated with managing OpenSearch and legacy Elasticsearch clusters in the AWS Cloud since it takes charge of administrative tasks: from provisioning infrastructure to software installation.
A managed solution for data power
As a managed platform, OpenSearch Service streamlines various data-centric tasks, such as website searches, interactive log analysis, and real-time application monitoring. Built on the open-source OpenSearch platform, it empowers you to explore, visualize, and analyze vast amounts of unstructured data, scaling up to petabytes of volume.
Advanced analytics suite
OpenSearch Service is your analytics suite for interactive log analytics, real-time application monitoring, and web search, featuring the latest OpenSearch versions and support for 19 Elasticsearch versions ranging from 1.5 to 7.10. It also boasts visualization capabilities through OpenSearch Dashboards and Kibana versions from 1.5 to 7.10.
Hassle-free resource provisioning
The Service not only provisions resources for your OpenSearch cluster but also automates the detection and replacement of failed nodes, alleviating the burdens of self-managed infrastructures. As to scaling your cluster, a single API call or a few clicks in the console suffice to do the job.
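As a rough illustration of the "single API call" mentioned above, the sketch below builds the parameters for the `UpdateDomainConfig` operation and, optionally, sends them with boto3. The domain name, node count, and instance type are hypothetical placeholders, and the send step assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def build_scale_request(domain_name: str, instance_count: int,
                        instance_type: str) -> Dict[str, Any]:
    """Build keyword arguments for an UpdateDomainConfig call."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "DomainName": domain_name,
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }


def scale_domain(domain_name: str, instance_count: int,
                 instance_type: str) -> Dict[str, Any]:
    """Send the resize request (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("opensearch")
    return client.update_domain_config(
        **build_scale_request(domain_name, instance_count, instance_type)
    )


# Hypothetical values, for illustration only.
request = build_scale_request("my-logs-domain", 5, "r6g.large.search")
print(request)
```

Keeping the request builder separate from the network call makes it easy to review or log the change before it is applied.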
What is Amazon OpenSearch Serverless?
OpenSearch Service also offers a serverless option for running search and analytics workloads in order to tackle all your infrastructure concerns. This serverless feature is perfect for handling sporadic or unpredictable workloads which will spare you the need to consider cluster sizing, monitoring, or fine-tuning.
The service tracks vital metrics, such as CPU usage, disk capacity, memory, and shard status. Should these metrics exceed their thresholds, the system automatically adjusts capacity without requiring any manual intervention. In OpenSearch Serverless, storage and compute function independently and can be scaled separately, so a need for more of one does not force you to pay for more of the other.
As of now, though, OpenSearch Serverless does not provide support for advanced OpenSearch Service functionalities, such as alerting, anomaly detection, and k-NN. If those are on your list of priorities, you can leverage managed clusters to access them until they are integrated into the serverless option.
For situations demanding precise cluster configuration or tailored adjustments, provisioned clusters are the sensible choice. Managed clusters let you select your desired instances and versions and give you enhanced control over configurations like refresh intervals or data-sharding strategies. These features can be crucial for use cases that deviate from the standard patterns that OpenSearch Serverless is able to address.
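To make the idea of tuning refresh intervals and sharding more concrete, here is a minimal sketch of the request body one might send when creating an index on a managed cluster. The shard count, replica count, and interval are illustrative assumptions, not recommendations.

```python
import json
from typing import Any, Dict


def build_index_settings(shards: int, replicas: int,
                         refresh_interval: str) -> Dict[str, Any]:
    """Body for a create-index request (PUT /<index-name>)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                # A longer interval trades search freshness for
                # indexing throughput.
                "refresh_interval": refresh_interval,
            }
        }
    }


# Hypothetical values chosen only for the example.
body = build_index_settings(shards=3, replicas=1, refresh_interval="30s")
print(json.dumps(body, indent=2))
```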
How Amazon OpenSearch Service relates to OpenSearch
As a distributed, community-driven, Apache 2.0-licensed, entirely open-source search and analytics suite, OpenSearch finds utility across a wide spectrum of applications and caters to diverse use cases, excelling in real-time application monitoring, log analytics, and website search. The platform allows exceptional scalability, ensuring rapid access and responses to vast data volumes. Integrated with OpenSearch Dashboards, it empowers effortless data exploration with a set of visualization tools. Since OpenSearch runs on the Apache Lucene search library, it has a rich array of search and analytics capabilities, including k-nearest neighbors (k-NN) search, SQL functionality, Anomaly Detection, Machine Learning Commons, Trace Analytics, and comprehensive full-text search, to name just a few.
Amazon OpenSearch Service, on the other hand, is a managed offering built around this open-source framework. The service was formerly named Amazon Elasticsearch Service, and OpenSearch itself originated as a fork of Elasticsearch. The points below examine how exactly Amazon OpenSearch Service is connected to OpenSearch.
- Core engine and compatibility:
  - Shared roots: Amazon OpenSearch Service is built on OpenSearch and relies on it as the underlying engine that powers its search and analytics capabilities.
  - Compatibility: Queries and indices used in OpenSearch generally carry over to Amazon OpenSearch Service. Applications, scripts, and configurations originally designed for OpenSearch can typically be used with minimal adjustments on Amazon OpenSearch Service, and vice versa.
- Service management:
  - Managed by AWS: Amazon OpenSearch Service is a fully managed service, which means that AWS handles everything from infrastructure provisioning to ongoing maintenance. This frees users from the operational complexities associated with self-hosting OpenSearch.
  - Open-source alternative: In contrast, OpenSearch is open-source software, so organizations are responsible for establishing and overseeing their own OpenSearch clusters on their preferred infrastructure. This demands more expertise and effort, especially for cluster management and upkeep.
- Integration and additional features:
  - AWS ecosystem integration: Amazon OpenSearch Service integrates with the wider AWS ecosystem, enabling users to combine it with services such as Amazon CloudWatch for monitoring and Amazon S3 for data storage.
  - Extended functionality: Amazon OpenSearch Service goes beyond the default offerings of open-source OpenSearch. It incorporates additional features and capabilities that are particularly well-suited for enterprise-level apps.
- Version control:
  - AWS versioning: In Amazon OpenSearch Service, AWS oversees which OpenSearch versions are available and manages them with compatibility and security in mind.
  - Open-source community: Conversely, OpenSearch depends on decisions made by the open-source community for its development and versioning. OpenSearch users enjoy more autonomy in choosing their preferred version but must also handle updates and compatibility matters independently.
- Security and compliance:
  - AWS security measures: Amazon OpenSearch Service is equipped with security measures designed specifically for AWS, featuring integration with AWS Identity and Access Management (IAM) and other AWS security services. It has also obtained compliance certifications, so it can serve various industries and regulatory needs.
  - Open-source security: While OpenSearch does offer security features, their effectiveness and the attainable compliance certifications depend on how users configure and manage the system themselves.
Having taken those points into account, one concludes that Amazon OpenSearch Service is a managed solution built upon the open-source OpenSearch project. The Service streamlines the deployment, operation, and scaling of OpenSearch clusters, presenting an appealing option for organizations seeking the benefits of OpenSearch without the associated operational complexities. While OpenSearch forms the open-source foundation, Amazon OpenSearch Service enriches the platform with AWS-specific enhancements, integration, and proficient management.
Benefits of OpenSearch Service
This chapter delves deep into the heart of Amazon OpenSearch Service, uncovering the myriad advantages it brings to the table. You’ll discover how this powerful tool empowers organizations to make data-driven decisions, accelerate innovation, and ultimately gain a competitive edge in today’s data-centric landscape.
Semi-structured and unstructured data search
OpenSearch Service facilitates retrieval of products, services, and documents from semi-structured and unstructured data with a number of functionalities for tailoring your search experience, including but not limited to full-text queries, autocomplete, scroll search, customizable scoring, and ranking.
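As a small illustration of full-text queries with customized scoring, the sketch below builds a `multi_match` query body in which matches in one field count more than matches in another. The field names `title` and `description` and the boost factor are assumptions made for the example; they are not taken from any particular index.

```python
import json
from typing import Any, Dict


def build_product_search(text: str, size: int = 10) -> Dict[str, Any]:
    """Full-text query over two hypothetical fields."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # "^3" boosts title matches so they score higher.
                "fields": ["title^3", "description"],
                # Tolerate small typos in the search text.
                "fuzziness": "AUTO",
            }
        },
    }


query = build_product_search("wireless headphones")
print(json.dumps(query, indent=2))
```

A body like this would be sent to the `_search` endpoint of the target index.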
High scalability and availability
An impressive 3 petabytes (PB) of data can be stored within a single cluster on OpenSearch Service, and a cluster is easily resized either through the console or with an API call. With the additional feature of cross-cluster search, you can extend your queries across 20 clusters in a unified search and analyze all your log data in a single interface. The system is engineered for reliable multi-Availability Zone deployments: data can be replicated across three Availability Zones within the same Region.
OpenSearch Service’s Trace Analytics feature follows requests as they propagate through distributed systems, which facilitates issue detection and resolution. Because the Service adheres to the OpenTelemetry standard for the intake of trace and log data, it can support a variety of monitoring needs.
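In OpenSearch, a cross-cluster search addresses remote indices as `<connection-alias>:<index>`. The following sketch assembles such a target for several clusters; the aliases `eu-logs` and `us-logs` and the index pattern are hypothetical.

```python
from typing import Iterable


def cross_cluster_target(aliases: Iterable[str], index_pattern: str) -> str:
    """Join '<alias>:<index>' pairs into one search target."""
    targets = [f"{alias}:{index_pattern}" for alias in aliases]
    if not targets:
        raise ValueError("at least one connection alias is required")
    return ",".join(targets)


target = cross_cluster_target(["eu-logs", "us-logs"], "app-logs-*")
# The result is used as the index part of a search path,
# for example: GET /<target>/_search
print(f"/{target}/_search")
```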
Diagnosis of infrastructure issues
OpenSearch Service empowers you to detect, analyze, and resolve issues within your infrastructure and AWS services, offering a streamlined approach to identifying and fixing problems. Machine-learning anomaly detection automatically spots anomalies during data ingestion with the Random Cut Forest (RCF) algorithm. This functionality can be integrated with alerting to enable near-real-time data monitoring and automated alert notifications, all of which helps keep your app healthy.
In OpenSearch Service, the UltraWarm and cold storage tiers offer cost-effective solutions in contrast to using the hot storage tier. The service also provides access to advanced features without incurring extra licensing fees. Such an approach not only saves expenses but also eliminates the necessity to get a team of specialists to oversee data clusters.
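One way to move data between tiers automatically is an Index State Management (ISM) policy. The sketch below builds a hot-to-warm-to-delete policy; the ages are arbitrary examples, and the `warm_migration` action name should be checked against the current service documentation before use.

```python
import json
from typing import Any, Dict


def build_tiering_policy(warm_after_days: int,
                         delete_after_days: int) -> Dict[str, Any]:
    """ISM policy: hot -> UltraWarm -> delete, driven by index age."""
    if warm_after_days >= delete_after_days:
        raise ValueError("indices must move to warm before deletion")
    return {
        "policy": {
            "description": "Hot-warm-delete workflow (illustrative).",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [{
                        "state_name": "warm",
                        "conditions": {
                            "min_index_age": f"{warm_after_days}d"},
                    }],
                },
                {
                    "name": "warm",
                    # Assumed action name for UltraWarm migration.
                    "actions": [{"warm_migration": {}}],
                    "transitions": [{
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": f"{delete_after_days}d"},
                    }],
                },
                {"name": "delete", "actions": [{"delete": {}}]},
            ],
        }
    }


policy = build_tiering_policy(warm_after_days=30, delete_after_days=90)
print(json.dumps(policy, indent=2))
```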
OpenSearch Service’s key security elements include encryption at rest, encryption in transit, and granular access control. Management APIs for essential operations, like domain creation and scaling, are governed by AWS Identity and Access Management (IAM) policies, enhancing security and access control measures.
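To show what governing management APIs with IAM can look like, here is a minimal sketch that generates an identity-based policy allowing a few domain-level operations on one domain. The region, account ID, and domain name are placeholders, and the action names use the `es:` prefix as an assumption to verify against the IAM action reference.

```python
import json
from typing import Any, Dict


def build_domain_admin_policy(region: str, account_id: str,
                              domain_name: str) -> Dict[str, Any]:
    """IAM policy allowing selected management calls on one domain."""
    domain_arn = f"arn:aws:es:{region}:{account_id}:domain/{domain_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action names; confirm before relying on them.
                "Action": [
                    "es:DescribeDomain",
                    "es:UpdateDomainConfig",
                ],
                "Resource": domain_arn,
            }
        ],
    }


# Placeholder account and domain, for illustration only.
iam_policy = build_domain_admin_policy(
    "us-east-1", "123456789012", "my-logs-domain")
print(json.dumps(iam_policy, indent=2))
```

Scoping the `Resource` to a single domain ARN keeps the permission from applying to every domain in the account.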
For those concerned with their app’s utmost security, OpenSearch Service’s security features are compliant with the Health Insurance Portability and Accountability Act (HIPAA). Beyond organizations working in healthcare, the Service assists in meeting compliance requirements for the Payment Card Industry Data Security Standard (PCI DSS), System and Organization Controls (SOC), International Organization for Standardization (ISO), and Federal Risk and Authorization Management Program (FedRAMP) standards.
OpenSearch Service pricing
As is typical for AWS products, OpenSearch Service follows a pay-as-you-go model with no minimum fee or upfront costs. Pricing falls into three categories: compute instances, storage, and data transfer. Each is described below.
The pricing structure is influenced by the specific compute instance type and number that you choose to use in your data cluster.
Storage costs depend on the Amazon Elastic Block Store (Amazon EBS) storage type and the capacity you need for your data volume. If you opt for Provisioned IOPS (SSD) storage, you will incur charges for the storage itself as well as the throughput you provision; you will not be billed for the I/Os you actually use, though.
AWS data transfer charges
For data transferred in and out of OpenSearch Service, you are subject to standard AWS data transfer fees. That being said, you will not be billed for data transfer between nodes within your OpenSearch Service domain.
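The three pricing categories can be combined into a back-of-the-envelope monthly estimate. The sketch below does the arithmetic; every rate in the example is a made-up placeholder, not an actual AWS price, and 730 is used as an approximate number of hours per month.

```python
HOURS_PER_MONTH = 730  # approximation used for monthly estimates


def estimate_monthly_cost(instance_count: int,
                          instance_hourly_rate: float,
                          storage_gb: float,
                          storage_rate_per_gb_month: float,
                          transfer_out_gb: float,
                          transfer_rate_per_gb: float) -> dict:
    """Sum compute, storage, and data transfer charges for one month."""
    compute = instance_count * instance_hourly_rate * HOURS_PER_MONTH
    storage = storage_gb * storage_rate_per_gb_month
    transfer = transfer_out_gb * transfer_rate_per_gb
    return {
        "compute": round(compute, 2),
        "storage": round(storage, 2),
        "transfer": round(transfer, 2),
        "total": round(compute + storage + transfer, 2),
    }


# Hypothetical rates; look up real ones on the pricing page.
estimate = estimate_monthly_cost(
    instance_count=3, instance_hourly_rate=0.10,
    storage_gb=500, storage_rate_per_gb_month=0.10,
    transfer_out_gb=100, transfer_rate_per_gb=0.09,
)
print(estimate)
```

Traffic between nodes inside the domain is left out of the calculation because, as noted above, it is not billed.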
Should you require a more detailed overview of the pricing tiers and options, do not hesitate to visit Amazon’s website (https://aws.amazon.com/opensearch-service/pricing/).
Typical use cases for OpenSearch Service
Current market conditions make organizations care about fast search experiences, elevating user interaction, and increasing conversion rates. An answer to these requirements is a reliable, versatile, and resilient full-text search engine. Huge amounts of data drive log analytics, and businesses seek to prevent suspicious actions, predict imminent critical system occurrences, and accelerate in-depth root cause analysis. Combining these two needs in a consolidated platform capable of seamlessly digesting log, trace, and metric data is a long-awaited answer to many teams and enterprises.
The DevOps teams are under pressure to achieve observability with apps’ log, trace, and metric data, all the while focusing on root cause analysis and anomaly detection. This challenge further extends to developers, who are tasked with crafting customized integrations, which entails time-consuming implementation and troubleshooting.
Moving on to customers, the management of open-source search engines like Elasticsearch and OpenSearch imposes a few responsibilities on them, as well. These encompass cluster administration, capacity sizing, scalability adjustments, optimization, patch application, and hardware oversight.
Leveraging the fully managed OpenSearch Service liberates various teams from their own sets of mundane tasks. All in all, use cases for Amazon OpenSearch Service fall into two groups, represented in detail in two subsections below.
Search use cases with Amazon OpenSearch Service
A search feature in a website or app empowers users to effortlessly locate what they are seeking by simply typing into a search bar. While this sounds pretty straightforward, the challenge lies in ensuring that the search system can operate rapidly enough during peak traffic.
Given that open-source search engines provide high-speed full-text search capabilities, it is the associated infrastructure management that can be time-consuming: activities like scaling servers for varying processing demands, cluster management, optimization, patching, and hardware upkeep steal the time and focus of engineers from app development.
The need to cater to complex operational management is met by Amazon OpenSearch Service, often at a lower cost than maintaining on-premises infrastructure. The Service’s machine learning (ML) algorithms rank results and present users with the most pertinent resources.
Web search at any scale
For websites and blogs, implementing a search function is a common necessity. OpenSearch Service helps users locate articles, posts, or any other relevant information on your platform, be it an application, a website, or a data lake catalog. The solution meets more sophisticated search needs as well, such as natural language search, auto-completion, faceted search, and location-aware search.
E-commerce platforms appreciate OpenSearch Service’s rapid and effective product search delivery, which influences the users’ ability to swiftly find the desired items directly. The shopping experience becomes ultimately satisfying with OpenSearch Service features of auto-suggestions, filters, and faceted search.
It is not only for users that the OpenSearch Service makes a difference: organization employees can use the Service’s features for internal knowledge exploration to quickly access the necessary information within documents, emails, and other company data, thus promoting efficient knowledge discovery.
If you are aiming to empower your customers to effortlessly locate their desired information, incorporating the fast and scalable full-text search features provided by Amazon OpenSearch Service will do the trick. The Service facilitates the integration of search functionality into database-backed applications, whereby the search engine mirrors the database content and utilizes machine learning (ML) for ranking, ensuring the delivery of relevant results and elevating the interactivity of customer experiences.
As for those mobile applications that prioritize user engagement in content consumption, such as news apps or social media platforms, OpenSearch Service integration allows users to swiftly uncover relevant content, which encourages user retention. Besides, apps dependent on location data, like mapping or travel apps, can leverage OpenSearch Service to offer geospatial search functionality, simplifying location-based searches. This equips users to explore nearby points of interest, businesses, or specific destinations.
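A location-based search of the kind described above can be expressed with a `geo_distance` filter. The sketch below builds such a query; it assumes the documents carry a field named `location` that is mapped as a `geo_point`, and the coordinates and radius are arbitrary example values.

```python
import json
from typing import Any, Dict


def build_nearby_query(lat: float, lon: float, distance_km: float,
                       field: str = "location") -> Dict[str, Any]:
    """Find documents within distance_km of the given point."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "geo_distance": {
                            "distance": f"{distance_km}km",
                            # The field must be mapped as geo_point.
                            field: {"lat": lat, "lon": lon},
                        }
                    }
                ]
            }
        }
    }


nearby = build_nearby_query(lat=48.8584, lon=2.2945, distance_km=2)
print(json.dumps(nearby, indent=2))
```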
Application owners with distinctive needs embrace OpenSearch Service’s extensive customizability. Whether the requirements involve crafting custom filters, fine-tuning search algorithms, or implementing other specialized functionalities, OpenSearch Service provides the adaptability needed to cater to specific use cases.
When it comes to fostering internal knowledge discovery, enterprises now have the option to integrate OpenSearch Service into their internal applications, enabling their employees to swiftly search for documents, reports, and various internal knowledge assets. This implementation expedites the retrieval of information and boosts informed decision-making.
Streaming data use cases with Amazon OpenSearch Service
Whichever environments you monitor, be they containers, microservices, or applications, the three foundational observability signals stay the same: metrics, logs, and traces.
The objective of observability signal collection is to offer a unified experience for DevOps and Site Reliability Engineers, enabling them to pinpoint crucial events and utilize all observability cues to isolate problems within containerized applications and microservices deployed across diverse environments. In order to help teams achieve this, Amazon OpenSearch Service seamlessly merges both log and trace data analytics into a single, comprehensive solution.
Originating from a multitude of sources, streaming data flows incessantly and encompasses various types: from log files generated by users of an app to telemetry from connected devices or sensors in data centers. Amazon OpenSearch Service is well-prepared to manage such diverse streaming data.
In dynamic and mixed environments, developers often lack the bandwidth to construct and debug customized integrations for amalgamating data from multiple sources, which can be time-consuming in terms of both implementation and troubleshooting. To accommodate this need, Amazon OpenSearch Service offers built-in integrations with Amazon S3, Amazon Kinesis Data Firehose, Amazon CloudWatch, Amazon DynamoDB, Amazon SageMaker, and AWS Key Management Service (KMS), with an eye on simplifying the process and saving valuable time.
Centralized log analytics
The multitude of logs generated by your applications, devices, and machinery at large scale and high speed often makes it difficult to keep up with and filter the data to identify meaningful insights. One of the challenges lies in maintaining control over diverse log sources and having the capability to anticipate system issues based on error messages.
Regardless of where log data originates, a cost-effective, secure, scalable, and adaptable solution becomes a pertinent need, especially for the large data volumes generated by Internet-of-Things (IoT) devices. Amazon OpenSearch Service consolidates log events into a unified view and offers a ready-made environment for analyzing log patterns, thus enhancing the management of your systems.
As apps grow more intricate and interconnected, regular updates of numerous components are likely to introduce failure scenarios. Preventing those requires more than mere monitoring of resource utilization and network system status. One needs not just to comprehend the ongoing events but also to proactively resolve potential problems.
A structured approach to comprehending the dynamics of complex systems is offered by observability and application performance monitoring (APM) tools. The toolset empowers developers and operators alike to define system behavior through the observation of external outcomes, in the end facilitating innovation and strengthening app reliability.
Through the analysis of the three fundamental observability signals (metrics, logs, and traces), engineers can effectively identify crucial events and issues within containerized applications and microservices, regardless of their deployment location.
Trace analytics with OpenTelemetry
Trace analytics, when combined with log data, serves a dual purpose: pinpointing where performance issues originate and diagnosing their underlying causes. It is worth noting, though, that correlating trace data with log events often involves navigating multiple interfaces. Another complication is that developers need specific visualization skills to construct monitoring views based on log data.
The good news is that this process can be simplified with Amazon OpenSearch Service. The latter enables the seamless analysis of both traces and logs through a single interface, streamlining the process of identifying and resolving performance issues within distributed apps. This, in turn, provides developers and DevOps engineers with insights into their application’s performance, allowing them to handle tuning and debugging.
Log analytics with open-source
It is reasonable to prioritize log analytics: it plays a pivotal role in safeguarding companies against risks by supporting adherence to security and industry regulations, and it helps enhance the user experience by detecting performance concerns. Still, the process of collecting application and infrastructure logs from different data origins and preparing them for analysis is rather tedious and time-consuming.
Many businesses now choose open-source log analytics solutions over proprietary software due to their perception of being more cost-effective, secure, and stable. An example of such an open-source log analytics suite, OpenSearch facilitates the ingestion of log data into your Amazon OpenSearch Service domain by employing open-source collectors and aggregators. These logs can originate from various sources, including application and infrastructure logs, security logs, AWS service logs, application trace logs, and application or infrastructure metrics.
For the collection and aggregation of log data, one can use open-source systems like Beats, Fluent Bit, Fluentd, and Data Prepper. Once refined, the data can be loaded into Amazon OpenSearch Service and subsequently visualized and analyzed using OpenSearch Dashboards.
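Collectors typically deliver documents in batches through the bulk API, which expects newline-delimited JSON: an action line followed by the document itself. As a minimal sketch, the function below formats log records that way; the index name `app-logs` and the sample records are invented for illustration.

```python
import json
from typing import Any, Dict, Iterable


def build_bulk_body(index_name: str,
                    documents: Iterable[Dict[str, Any]]) -> str:
    """Format documents as an NDJSON body for the _bulk endpoint."""
    lines = []
    for doc in documents:
        # Action line: index this document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


sample_logs = [
    {"level": "ERROR", "message": "payment service timeout"},
    {"level": "INFO", "message": "user logged in"},
]
bulk_body = build_bulk_body("app-logs", sample_logs)
print(bulk_body)
```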
To ensure the robustness of their IT systems, security teams have to sift through a backlog of security alerts. One way to address this challenge is a security information and event management (SIEM) solution. Amazon OpenSearch Service can serve as one: it simplifies the handling of security and event data for SecOps teams by consolidating and examining logs across applications and systems, and it aids real-time threat detection and incident management.
One practical example is rapidly indexing, searching, and visualizing logs sourced from routers, applications, and other devices. This capability facilitates the identification and mitigation of security threats such as data breaches, unauthorized login attempts, distributed denial of service (DDoS) attacks, and fraudulent activities.
The SIEM capabilities of Amazon OpenSearch Service can be further strengthened by integrating with AWS Security Hub to extend the retention period of findings beyond what Security Hub offers. This integration also fosters the consolidation of findings across multiple administrator accounts and enhances the correlation of Security Hub findings with each other and with additional log sources.
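For a flavor of how such threats might be surfaced from indexed logs, the sketch below builds a query that counts recent failed logins per source address. The field names `event.outcome`, `@timestamp`, and `source.ip` are assumptions about how the logs were indexed, and the time window and attempt threshold are arbitrary.

```python
import json
from typing import Any, Dict


def build_failed_login_query(window: str = "15m",
                             min_attempts: int = 5) -> Dict[str, Any]:
    """Group recent failed logins by source IP address."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"event.outcome": "failure"}},
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        },
        "aggs": {
            "by_source_ip": {
                "terms": {
                    "field": "source.ip",
                    "size": 20,
                    # Hide addresses with only a few failures.
                    "min_doc_count": min_attempts,
                }
            }
        },
    }


login_query = build_failed_login_query()
print(json.dumps(login_query, indent=2))
```

A query like this could also serve as the basis of an alerting monitor that notifies the team when any bucket is returned.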
Amazon OpenSearch FAQ
The core components of OpenSearch Serverless are driven by the open-source OpenSearch project, encompassing a search engine known as OpenSearch and a visualization interface, OpenSearch Dashboards.
Amazon OpenSearch Service domains are essentially Elasticsearch clusters spanning versions 1.5 to 7.10, or OpenSearch clusters, which can be established via the Amazon OpenSearch Service console, CLI, or API. Each domain represents a cloud-based OpenSearch or Elasticsearch cluster and can be configured with the specified compute and storage resources.
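As an illustration of creating a domain through the API, the sketch below assembles parameters for the `CreateDomain` operation and can send them with boto3. The domain name, instance type, and volume size are hypothetical, and the send step assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def engine_version(engine: str, version: str) -> str:
    """Format an engine string such as 'OpenSearch_1.0'."""
    if engine not in ("OpenSearch", "Elasticsearch"):
        raise ValueError("engine must be OpenSearch or Elasticsearch")
    return f"{engine}_{version}"


def build_create_domain_request(domain_name: str, engine: str,
                                version: str, instance_type: str,
                                instance_count: int,
                                volume_gb: int) -> Dict[str, Any]:
    """Keyword arguments describing compute and storage resources."""
    return {
        "DomainName": domain_name,
        "EngineVersion": engine_version(engine, version),
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp3",
            "VolumeSize": volume_gb,
        },
    }


def create_domain(**kwargs: Any) -> Dict[str, Any]:
    """Send the request (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("opensearch").create_domain(**kwargs)


# Hypothetical configuration, for illustration only.
create_request = build_create_domain_request(
    "my-logs-domain", "OpenSearch", "1.0", "r6g.large.search", 3, 100)
print(create_request)
```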
Amazon OpenSearch Service allows a substantial level of customization. It is you as the user who decides to create or remove domains, define infrastructure characteristics, and manage access and security. You are also at liberty to choose whether to operate one or multiple Amazon OpenSearch Service domains as needed.
Amazon OpenSearch Service takes care of the tasks associated with establishing a domain, beginning with allocating infrastructure capacity in the specified network environment and proceeding to the installation of OpenSearch or Elasticsearch software.
Once your domain is operational, Amazon OpenSearch Service automates routine administrative activities, such as backup procedures, instance monitoring, and software patching, to relieve you of the maintenance burden. The Service can also be integrated with Amazon CloudWatch, which provides metrics that give insight into the status of the domains. As for customization, Amazon OpenSearch Service provides choices for adjusting your domain instance and storage configurations to meet the requirements of your specific applications.
Established in 2021 with a mission of serving as a secure, high-quality, fully open-source search and analytics suite, the OpenSearch project is a community-driven, open-source fork of Elasticsearch and Kibana and an outlet for Amazon’s investments into reliable and innovative solutions.
The project encompasses OpenSearch (derived from Elasticsearch 7.10.2) and OpenSearch Dashboards (derived from Kibana 7.10.2). The first version of OpenSearch was released on July 12, 2021, and Amazon added support for OpenSearch 1.0 to the managed service on September 7, 2021, renaming Amazon Elasticsearch Service to Amazon OpenSearch Service at the same time. Amazon still maintains support for legacy Elasticsearch versions up to 7.10 alongside OpenSearch 1.0. The name change does not affect ongoing operations, development processes, or business use.