Big Data and Solr

Using Solr with Big Data

Course Overview

This two-day training course is designed for Solr developers who want to:

  • Learn how to use key open source tools such as Hadoop, Cascading and Mahout.
  • Process big data using workflows to generate large search indexes.
  • Use Solr 4 as a scalable NoSQL database and analytics engine.

The twenty modules and five labs teach participants about the essential technologies for generating both traditional Solr search indexes and NoSQL/analytics "databases" using Hadoop, Cascading, Cassandra, Mahout and Storm. By the end of the class, students will understand both good and bad use cases for these popular "big data" technologies, and how they can be used to create larger, more sophisticated search and analytics solutions based on Solr.

Class Schedule and Registration

More classes being scheduled soon!

Who Should Attend?

This class is for Solr developers who want to combine the flexible search functionality of Apache Solr with the big data processing of Apache Hadoop to create indexes for both general search and augmented data analytics. Lab exercises and real-world examples reinforce the content.

Prerequisites

To get the most from this course, you should have experience developing Solr applications and with Java development. We also recommend completing "Solr Unleashed" and gaining relevant work experience before taking this class.

Day 1

Introductions

  • Real-world example of a big data problem
  • Real-world solution using Hadoop & Solr
  • Hadoop overview
  • Hadoop distributed file system (HDFS)
  • Map-reduce
  • Writing Hadoop jobs
  • Map-reduce Lab (30 minutes)
  • Map-reduce Lab review
  • Hadoop summary
  • Cloud computing with Amazon Web Services
  • Workflows with Cascading
  • NoSQL with Cassandra
  • Continuous processing with Storm
  • Machine learning with Mahout
  • Mahout Lab
  • Mahout Lab review

Day 2

Big Search

  • Why use Hadoop with Solr?
  • Designing workflows
  • Moving Big Data
  • Scalable Solr indexing
  • Solr indexing Lab
  • Solr indexing Lab review
  • Complex Solr search example
  • Solr as a NoSQL engine
  • Solr-based analytics
  • Solr analytics Lab
  • Solr analytics Lab review
  • Massive Solr index example
  • Solr URL Lab (45 minutes)

Format

Instructor-led lectures with hands-on lab exercises, examples & demonstrations.

Course Materials

Participants will receive an electronic copy of all slides and handouts, as well as links to other resources and downloads.

More information

If you have questions about this or any other LucidWorks University class, please contact the LucidWorks University team.

Cancellation Policy

Registration for a class can be cancelled up to 14 calendar days before the class date for either a full refund or credit toward another class. No credit or refund can be given for no-shows or for registrations cancelled fewer than 14 calendar days before the class date. If a registered participant is unable to attend the course, a substitute is welcome to take their place.

On occasion, LucidWorks has to cancel or reschedule a class. If this happens, we will notify you as far in advance of the scheduled course dates as possible. If a course is cancelled, the liability of LucidWorks is limited to the return of paid registration fees.