In a May 2014 article published in The New York Review of Books, David Cole quoted the former CIA Director, General Michael Hayden, as saying, “We kill people based on metadata.” This was in reference to the bulk collection of phone records by the National Security Agency. In response to American citizens’ uproar over the collection of their phone records, the agency argued that it did not listen in on the calls but only collected the metadata. Metadata has been defined as “data about data.” The metadata that the agency collected included phone numbers, the times of calls, the numbers called, and the duration of calls. With such information, analysts could map who was in contact with whom and follow those connections to a potential terrorist. That is how valuable metadata is to the security agencies.
Metadata drives our technological world. It is amazing how often we interact with metadata in the course of our day, whether at the bank, at the dreadful DMV, doing a simple web search on Google, or trying to access a database. The information generated by these technological systems is structured by metadata.
In my last blog post, I wrote a review of Music Online: African American Music Reference. In this post, I want to review the metadata used in this database. When you conduct a search with keywords, hovering your cursor over the image in a result shows you why you received that particular result. For example, I did a search for “civil rights.” The first result was “Songs of Protest and Civil Rights,” written by Jerry Silverman. Hovering my cursor over the image showed this string: “Songs of Protest and <em>Civil</em> <em>Rights</em>.” The keywords that tag this resource are “Civil” and “Rights.” That is the metadata.
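To make the highlighting idea concrete, here is a minimal Python sketch of how a search interface might wrap matched keywords in <em> tags. I do not know how the database actually does this; the function name `highlight` and the matching rule (whole words, ignoring case) are my own assumptions for illustration.

```python
import re


def highlight(title: str, query: str) -> str:
    """Wrap every query keyword found in the title in <em> tags.

    Hypothetical helper: matching is whole-word and case-insensitive,
    which is an assumption, not the database's documented behavior.
    """
    keywords = [re.escape(word) for word in query.split()]
    if not keywords:
        return title
    pattern = re.compile(r"\b(" + "|".join(keywords) + r")\b", re.IGNORECASE)
    # Keep the original capitalization of the matched word inside the tag.
    return pattern.sub(lambda match: f"<em>{match.group(0)}</em>", title)


print(highlight("Songs of Protest and Civil Rights", "civil rights"))
```

Under these assumptions, the words “Civil” and “Rights” in the title come back tagged, while the rest of the title is left alone.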
The metadata on this website is structured, so the information that backs the resources is not used randomly in the database. Several categories are used throughout this database to structure the data. These are subject, topic/theme, field of study, content type, author/creator, person discussed, etc. This is valuable information to researchers. The metadata allows the researcher to query the database for resources associated with a particular theme, such as “civil rights,” and, even more specifically, for lyrics from civil rights music created by a particular individual. The results are more accurate with the proper metadata. Though this database is not case sensitive, it is important to spell the words correctly and put them in the proper order. For example, “civil rights” produces better search results than “rights civil.” The reason is that the metadata is created in a way that reflects how we generally use language. Even though the two words are the same, the metadata is structured as “civil rights” and not as “rights civil.”
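Here is a small Python sketch of what querying structured metadata can look like. Everything in it is hypothetical: the `Record` class, the `search` function, and the sample records are invented for illustration and are not taken from the database. The field names loosely follow the categories listed above.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    subject: str
    content_type: str
    creator: str


# Invented sample records, not real entries from the database.
CATALOG = [
    Record("Example Song Collection", "civil rights", "lyrics", "Example Songwriter"),
    Record("Example Essay", "civil rights", "essay", "Example Scholar"),
    Record("Example Ballads", "civil war", "lyrics", "Example Songwriter"),
]


def search(catalog, **criteria):
    """Return records whose named fields equal the given values, ignoring case."""
    def matches(record):
        return all(
            getattr(record, field).lower() == value.lower()
            for field, value in criteria.items()
        )
    return [record for record in catalog if matches(record)]


# Case does not matter, but word order does, because the subject
# is stored as the phrase "civil rights".
print(len(search(CATALOG, subject="Civil Rights")))
print(len(search(CATALOG, subject="rights civil")))

# Narrow the theme down to lyrics by one creator.
for record in search(CATALOG, subject="civil rights",
                     content_type="lyrics", creator="Example Songwriter"):
    print(record.title)
```

In this sketch, each extra field narrows the results, which is the same effect I described for a researcher looking for civil rights lyrics by a particular individual.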
The database has some deficiencies when it comes to wildcard characters, such as the asterisk (*), for partial matching. It does not have this functionality, and using a wildcard does not produce the desired results. It also cannot use “or” and “not” to return appropriate results. For example, I searched for “Civil or Uncivil rights.” None of the results had anything to do with civil rights. The very first result was on the civil war and the second result was on civil Islam. It is important for users to be aware that the default operator for this database is “AND.”
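The difference between an “AND” default and a true “OR” operator can be sketched in a few lines of Python. This is only an illustration of the two behaviors on a toy set of invented titles. It does not reproduce how the database ranks or matches results, which I do not know.

```python
# Invented titles for illustration only.
DOCS = {
    "doc1": "civil rights songs",
    "doc2": "civil war ballads",
    "doc3": "uncivil rights essays",
}


def tokenize(text: str) -> set:
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def search_and(query: str) -> list:
    """AND default: every word in the query must appear in the title."""
    terms = tokenize(query)
    return sorted(doc for doc, text in DOCS.items() if terms <= tokenize(text))


def search_or(query: str) -> list:
    """OR behavior: any word in the query may appear in the title."""
    terms = tokenize(query)
    return sorted(doc for doc, text in DOCS.items() if terms & tokenize(text))


print(search_and("civil rights"))
# With an AND default, "or" is just another word that must be matched.
print(search_and("civil or uncivil rights"))
print(search_or("civil uncivil"))
```

In this toy version, typing “or” into an AND-only search makes the query stricter rather than broader, which is the opposite of what the user intends.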
On the whole, when searched with the proper metadata, the database returns very good results.