Thursday, October 25, 2012

Asking the Right Questions: The Challenge of Big Data

I'm continuing to liveblog / take notes at the Ark conference. This was a good session about a topic that is new for law firms (if not for their clients).

Maura Grossman is counsel at Wachtell Lipton. Chad Ergun is Director of Global Practice Services & Business Intelligence at Gibson Dunn. Mary Abraham is counsel at Debevoise & Plimpton.

Size matters. By renaming information overload as "big data" we imply that we actually can cope with it (Jeremy Bently).

Big Data has three widely accepted attributes: volume, velocity, and variety. Volume reaches well beyond gigabytes and terabytes to petabytes and beyond. We're dealing with more stuff coming at us with greater speed and variety than any existing tools can handle. The panelists added the attributes of veracity and value. Can you actually rely on large vats of data? If you can't, do they have any value?

Maura asks if the risk of keeping this stuff around is greater than the benefit.

Until now we've been able to manage information in Excel cells. The type of information being produced today, however, is mostly unstructured.

Unstructured information is of many types, ranging from audio and video to texts, documents, and email.  Metadata and tags can't really cope with this type of information.

Rod Smith suggests that big data is about new uses and insights into information, not new amounts.

Law firms may not be addressing big data very much at this point, but our clients certainly are.  Big data might be used, for instance, to mine help desk calls and support requests.

Productivity increases more than 5% at organizations that have implemented big data analytics.

One example of big data usage is real-time computer monitoring of live video feeds, for instance, to alert law enforcement to unusual occurrences. Another is grocery store monitoring of customers looking for food.

There is a dark side of big data, such as NSA monitoring of international calls combined with email monitoring.

Industries inquiring about big data include banking and finance companies, the service sector, government, and manufacturing.

Big data may itself be a new form of value.

Common purposes for big data in law firms include competitive intelligence related to an RFP, lateral hiring, or business development.

Ergun: you're not ready for big data if you're tracking your matter information in Excel or in Access databases.

What would happen if you could mine your matters without tracking or coding, using current technology to extract key metadata? You could likely find related matters, types of matters, staffing models, and the like. Through an interface you could sift through different aspects like hours / fees / rates / leverage.

You could break hours down by client / industry / type of acquisition. All information would be touchable / drillable.
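To make the "drillable" idea concrete, here is a minimal sketch of rolling time entries up along one dimension (client or industry) and reading off hours, fees, average rate, and leverage. It is my illustration, not anything the panel showed: the records, field names, and the `drill_down` function are all hypothetical, and leverage is assumed here to mean associate hours divided by partner hours.

```python
from collections import defaultdict

# Hypothetical time entries; every name and number is made up for illustration.
ENTRIES = [
    {"client": "Acme", "industry": "Banking", "role": "partner", "hours": 10.0, "fees": 9000.0},
    {"client": "Acme", "industry": "Banking", "role": "associate", "hours": 30.0, "fees": 15000.0},
    {"client": "Birch", "industry": "Retail", "role": "partner", "hours": 5.0, "fees": 4000.0},
    {"client": "Birch", "industry": "Retail", "role": "associate", "hours": 5.0, "fees": 2000.0},
]


def drill_down(entries, dimension):
    """Summarize hours, fees, average rate, and leverage per value of `dimension`."""
    totals = defaultdict(
        lambda: {"hours": 0.0, "fees": 0.0, "partner_hours": 0.0, "associate_hours": 0.0}
    )
    for entry in entries:
        bucket = totals[entry[dimension]]
        bucket["hours"] += entry["hours"]
        bucket["fees"] += entry["fees"]
        bucket[entry["role"] + "_hours"] += entry["hours"]

    summary = {}
    for key, t in totals.items():
        summary[key] = {
            "hours": t["hours"],
            "fees": t["fees"],
            # Average billed rate: total fees over total hours.
            "rate": t["fees"] / t["hours"] if t["hours"] else None,
            # Assumed definition of leverage: associate hours per partner hour.
            "leverage": (
                t["associate_hours"] / t["partner_hours"] if t["partner_hours"] else None
            ),
        }
    return summary


if __name__ == "__main__":
    for dim in ("client", "industry"):
        print(dim, drill_down(ENTRIES, dim))
```

Switching the `dimension` argument from `"client"` to `"industry"` is the code equivalent of clicking a different facet in the interface described above.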

RAND Corporation indicates that eDiscovery data costs around $0.20 per GB per day to store and around $18,000 per GB to produce.

The question is, what decisions could we make if we had all the information we need?

When you can bring two completely separate data sources together, you can create something beautiful (like Google flu queries and CDC population data).
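As a rough sketch of what such a mashup looks like mechanically, the snippet below joins two independent data sets on a shared key (region) and derives a number neither source holds alone: flu-related queries per 100,000 residents. The figures, region names, and the `mash_up` function are hypothetical placeholders, not real Google or CDC data.

```python
# Hypothetical inputs; the numbers are invented purely to show the join.
FLU_QUERIES = {"North": 500, "South": 1200, "West": 300}
POPULATION = {"North": 1_000_000, "South": 2_000_000, "East": 500_000}


def mash_up(query_counts, population):
    """Join two sources on region and return queries per 100,000 residents.

    Only regions present in both sources are kept (an inner join).
    """
    shared = query_counts.keys() & population.keys()
    return {
        region: query_counts[region] * 100_000 / population[region]
        for region in sorted(shared)
    }


if __name__ == "__main__":
    for region, rate in mash_up(FLU_QUERIES, POPULATION).items():
        print(f"{region}: {rate:.1f} queries per 100,000 residents")
```

Regions that appear in only one source ("West" and "East" here) drop out, which is a reminder that a mashup is only as good as the overlap between its sources.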

Mary polled the audience for some questions or topics that could benefit from big data mashups.

Some ideas were:


  • Cost per document version
  • Matter management & pricing
  • Trends in words in email exchanges
  • Government economic data and IPO list price
  • Attributes of pitches and success rates
  • Attributes of pleadings and trials and litigation success rates
  • Bankruptcy trends and client payments


We have to think without our usual constraints.  If we had our dream data sets, which of our usual business problems could we solve?

The momentum behind big data is increasing. Clients are starting to make price adjustments based on competitors' prices. Within two to five years there will be a lot more big data work.

One (allegedly true) big data story is that of a father who went to the drug store to complain about his daughter receiving emails and coupons related to pregnancy. He thought they would inappropriately influence her. A week later he called to apologize: the drug store (actually its centralized data center) had correctly figured out from the teen's purchases that she was pregnant and targeted her with offers for things she was likely to buy.

1 comment:

Ian Campbell-iCONECT said...

This is a great article...thanks Dave. It's one thing to talk about the value of analytics on big data or reducing big data, and it's another to find a product that can actually store big data for review. iCONECT, with its optional Oracle back end, currently has a project with 44 BILLION pages, being accessed by about 700 attorneys. Very few (if any) other litigation review software programs can even touch that quantity. I think we all agree with the value of analytics on a large case; however, the next time a "Big Data" case comes forward, make sure to ask the hosting provider or law firm if the software they are using is actually going to work with those kinds of volumes. Ian Campbell, iconect.com