diff --git a/articles/search-with-express-and-mongodb.markdown b/articles/search-with-express-and-mongodb.markdown
new file mode 100644
index 0000000..3a8e29e
--- /dev/null
+++ b/articles/search-with-express-and-mongodb.markdown
@@ -0,0 +1,72 @@
+Title: Powerful Search with Express and MongoDB
+Author: Michael Bosworth
+Date: Thu Mar 15 2013 21:28:42 GMT+0000 (UTC)
+Node: v0.8.22
+
+There are many ways to implement search, and usually there is a specific need that you are trying to meet. This may not be the search you are looking for, but it is a powerful one to leverage if ever the need arises! I'm talking about compounding search terms to match documents stored in [mongoDB][]. My use case is an employee directory. I don't know the person's name, but I should be able to type the person's team and office location and narrow it down. Let's get started!
+
+###Sample Data###
+
+For this example, we will use a very simple data set. We will process this data and insert it into mongo.
+
+
+
+###Building An Index Attribute###
+
+To accomplish a compound search we are going to leverage [mongoDB][]'s support for regex. Since there is no easy way to match a pattern against all the attributes in a document, we will flatten the data into an index attribute and test for matches there. For simplicity, I am using the [mongojs][] node module. To see where this code lives in the context of Express, visit the [example project][].
+
+
+
+Once the import has run, the {"first":"John"} document in mongo should look like this. Notice the index attribute has concatenated all of John's information.
+
+
+
+###Express###
+
+Now that our data is in MongoDB, we create a simple Express server that will accept and process our query. For now, our route returns all the data from the employees collection.
+
+
+
+###Singular Matching###
+
+To make our query return meaningful data, we want to take the term that comes in from the request, match that term against the index string of each document, and return the results. All this requires is a minor tweak in our route. The regex pattern is simple: match the whole term passed, match it globally ("g"), and be insensitive to case ("i").
+
+
+
+This gives us some pretty powerful results. We can search for any attribute of the document and get a match. We can even use partial words and get a match! You can see how powerful this would be when paired with a rich realtime front end. Maybe we will tackle that in a different post.
+
+
+
+###Compound Matching###
+
+We are getting close. Let's take what we've built and allow for multiple terms to be matched. For this, we need a delimiter, something in the query that denotes where one term ends and the next begins. Then we know how to parse it. Because I want users to use English-like sentences in searching, I'm just going to use a space (" ").
+
+We are going to need to build a regex pattern for each term that we parse, so let's make a function for building patterns. We need this function to be a little more robust than our single pattern match and strip whitespace (in the case of a double space).
+
+
+
+Before we tweak our route, we need to learn something about MongoDB. While Mongo queries are written in JSON, Mongo supports several custom operators for performing more advanced queries. For a list of those, check out the [Mongo docs][]. We are going to use the $all operator. This operator accepts an array of queries, and ALL of them have to match a document in order for it to be included in the result. We will leverage this to build an array of, you guessed it, regex patterns and match them against the index string.
+
+
+
+Search is at a whole new level. The nice thing about regex is that I can be lazy. I can just search for "mil" and narrow my results down to the two employees who work in Milwaukee. If I query "mil hum", short of course for employees working in Milwaukee's Human Resources department, then I narrow my result down to one and I don't have to know the person's name. Also note that we can search for employees based on phone or email. The algorithm gives us freedom to query the data from any angle.
+
+
+
+###Concerns###
+
+I'm sure red flags went off for you around the data import section. "Redundancy!", you cried! Yes, I admit it, I'm not making good use of the resources on the hard disk. You'll have to weigh the pros and cons based on how big your data set is. And who knows, maybe there is a better way.
+
+Enjoy!
+
+__Boz__.
+
+[git]: http://git-scm.com/
+[node]: http://nodejs.org
+[npm]: http://npmjs.org/
+[express]: http://github.com/visionmedia/express
+[mongoDB]: http://www.mongodb.org
+[Express and MongoDB]: /express-and-mongodb
+[mongojs]: https://github.com/gett/mongojs.git
+[example project]: https://github.com/bozzltron/search-with-express-and-mongodb
+[Mongo docs]: http://docs.mongodb.org/manual/reference/operators/
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/compound-match-route.js b/articles/search-with-express-and-mongodb/compound-match-route.js
new file mode 100644
index 0000000..97f98b6
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/compound-match-route.js
@@ -0,0 +1,23 @@
+app.get('/query/:term', function(req, res){
+
+  var term = req.params.term;
+
+  // Break out all the words
+  var words = term.split(" ");
+  var patterns = [];
+
+  // Create case insensitive patterns for each word
+  words.forEach(function(item){
+    patterns.push(caseInsensitive(item));
+  });
+
+  db.employees.find({index : {$all: patterns }}, function(err, results) {
+    if( err || !results) {
+      res.json([]);
+    } else {
+      res.json(results);
+    }
+  });
+
+
+});
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/data.json b/articles/search-with-express-and-mongodb/data.json
new file mode 100644
index 0000000..b5ff94d
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/data.json
@@ -0,0 +1,26 @@
+[
+  {
+    "first":"John",
+    "last":"Doe",
+    "office":"Toronto",
+    "team":"Marketing",
+    "phone":"555-203-3002",
+    "email": "john@company.com"
+  },
+  {
+    "first":"Jane",
+    "last":"Doe",
+    "office":"Milwaukee",
+    "team":"Human Resources",
+    "phone":"555-203-3007",
+    "email": "jane@company.com"
+  },
+  {
+    "first":"Chuck",
+    "last":"Smith",
+    "office":"Milwaukee",
+    "team":"Development",
+    "phone":"555-203-3003",
+    "email": "chuck@company.com"
+  }
+]
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/express.js b/articles/search-with-express-and-mongodb/express.js
new file mode 100644
index 0000000..b04fbca
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/express.js
@@ -0,0 +1,43 @@
+
+/**
+ * Module dependencies.
+ */
+
+var express = require('express')
+  , http = require('http');
+
+var app = express();
+
+// Mongo setup
+var databaseUrl = "company";
+var collections = ["employees"];
+var db = require("mongojs").connect(databaseUrl, collections);
+
+app.configure(function(){
+  app.set('port', process.env.PORT || 3000);
+  app.use(express.bodyParser());
+  app.use(express.methodOverride());
+  app.use(app.router);
+});
+
+// Import the data
+require('./import')(db);
+
+app.get('/query/:term', function(req, res){
+
+  var term = req.params.term;
+
+  db.employees.find({}, function(err, results) {
+    if( err || !results) {
+      res.json([]);
+    } else {
+      res.json(results);
+    }
+  });
+
+
+});
+
+http.createServer(app).listen(app.get('port'), function(){
+  console.log("Express server listening on port " + app.get('port'));
+});
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/import.js b/articles/search-with-express-and-mongodb/import.js
new file mode 100644
index 0000000..2617e59
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/import.js
@@ -0,0 +1,35 @@
+// A function for building a search index collection for mongodb
+function processData(data, db) {
+
+  // Iterate over raw data objects
+  data.forEach(function(item){
+
+    // Instantiate our index string
+    var index = "";
+
+    // Iterate over the object keys
+    for (var key in item) {
+      if (item.hasOwnProperty(key)) {
+        var obj = item[key];
+
+        // append the value to the index
+        index += item[key];
+
+      }
+    }
+
+    // Add the index attribute to the object
+    item.index = index;
+
+  });
+
+  // Once all the data is processed let's insert it into mongo
+
+  // Clear any existing data
+  db.employees.drop();
+
+  // Because we are passing an array of objects,
+  // mongo knows to create a new document for each one
+  db.employees.insert(data);
+
+}
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/john.json b/articles/search-with-express-and-mongodb/john.json
new file mode 100644
index 0000000..4d2620b
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/john.json
@@ -0,0 +1,9 @@
+ {
+   "first":"John",
+   "last":"Doe",
+   "office":"Toronto",
+   "team":"Marketing",
+   "phone":"555-203-3002",
+   "email": "john@company.com",
+   "index": "JohnDoeTorontoMarketing555-203-3002john@company.com"
+ }
\ No newline at end of file
diff --git a/articles/search-with-express-and-mongodb/mil.png b/articles/search-with-express-and-mongodb/mil.png
new file mode 100644
index 0000000..326311f
Binary files /dev/null and b/articles/search-with-express-and-mongodb/mil.png differ
diff --git a/articles/search-with-express-and-mongodb/mil_hum.png b/articles/search-with-express-and-mongodb/mil_hum.png
new file mode 100644
index 0000000..73ff7a4
Binary files /dev/null and b/articles/search-with-express-and-mongodb/mil_hum.png differ
diff --git a/articles/search-with-express-and-mongodb/regex.js b/articles/search-with-express-and-mongodb/regex.js
new file mode 100644
index 0000000..310be6b
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/regex.js
@@ -0,0 +1,7 @@
+// Build a regex pattern without whitespace
+function caseInsensitive(keyword){
+  // Trim
+  keyword = keyword.replace(/^\s+|\s+$/g, '');
+
+  return new RegExp(keyword, 'gi');
+}
diff --git a/articles/search-with-express-and-mongodb/single-match-route.js b/articles/search-with-express-and-mongodb/single-match-route.js
new file mode 100644
index 0000000..0a67160
--- /dev/null
+++ b/articles/search-with-express-and-mongodb/single-match-route.js
@@ -0,0 +1,15 @@
+app.get('/query/:term', function(req, res){
+
+  var term = req.params.term;
+  var pattern = new RegExp(term, 'gi');
+
+  db.employees.find({index:pattern}, function(err, results) {
+    if( err || !results) {
+      res.json([]);
+    } else {
+      res.json(results);
+    }
+  });
+
+
+});
\ No newline at end of file
diff --git a/authors/Michael Bosworth.markdown b/authors/Michael Bosworth.markdown
new file mode 100644
index 0000000..f41cd1d
--- /dev/null
+++ b/authors/Michael Bosworth.markdown
@@ -0,0 +1,6 @@
+Email: mtbosworth@gmail.com
+Github: bozzltron
+Twitter: bozzltron
+Homepage: http://www.balancedscale.com
+
+A JavaScript developer and nodejs enthusiast.
\ No newline at end of file