IBM Watson™ Discovery Service Ideas

Support for additional DOCUMENT TYPES in Watson Discovery

We deal with many different document types, and our ability to focus on Watson is limited by Watson Discovery's document ingestion, which supports only basic document/artifact types.

We encounter many formats, but by percentage these are our top 10:

PDF, HTML, DOC(X), PPT(X)(S), XLS(X), JSON, TXT, RTF, CSV, EPUB

We have also seen ODT, ODP, ODS, TEX and their relatives, mostly when working with government clients.

While we don't expect Watson to deal specifically with ZIP files, it would be nice to have a simple way to package artifacts and minimize the size, time and cost of transferring them, if possible with support for other file compression formats as well.
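As a rough illustration of the packaging step described here, the following is a minimal Python sketch that uses only the standard library's zipfile module. The helper name bundle_artifacts is hypothetical, and nothing here assumes that Discovery accepts such an archive (that is what this idea asks for), so no upload call is shown.

```python
import zipfile
from pathlib import Path
from typing import Tuple


def bundle_artifacts(source_dir: str, archive_path: str) -> Tuple[int, int]:
    """Zip every file under source_dir; return (raw_bytes, archive_bytes).

    Hypothetical helper for illustration only, not part of any Watson SDK.
    """
    root = Path(source_dir)
    archive = Path(archive_path).resolve()
    raw_bytes = 0
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories, and skip the archive itself if it lives under root.
            if path.is_file() and path.resolve() != archive:
                raw_bytes += path.stat().st_size
                # Store paths relative to the root so the archive is portable.
                zf.write(path, arcname=path.relative_to(root).as_posix())
    return raw_bytes, archive.stat().st_size
```

Comparing the two returned sizes gives a quick estimate of how much transfer volume a single compressed bundle would save for a given set of artifacts.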

We fully expect to encounter more formats eventually, and we want to minimize the effort, cost and transforms needed to analyze them through Watson, along with the potential for OCR.

  • Guest
  • Dec 12 2017
  • Guest commented
    December 12, 2017 05:34

    The most annoying one is that TEXT files are not a supported document type, considering how those are the easiest files to read.

    My customer has these unsupported formats:

    Plain text (txt)

    MS Outlook messages and templates (msg, olt)

    Excel spreadsheets (xlsx, xls)

  • Vijay Gupta commented
    May 16, 2018 15:56

    VMware is looking to zip up all their JSON files and upload the archive to Discovery. That way they would not have to deal with uploading, retrying and keeping track of individual files, which otherwise has to be done with sophisticated scripts. Customer request.
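To make the per-file bookkeeping mentioned in this comment concrete, here is a minimal Python sketch of the kind of script a customer would otherwise need. It is an assumption-laden illustration: upload_all is a hypothetical helper, and upload_one is a hypothetical callable standing in for whatever call actually sends one file to Discovery; no real Watson API is used.

```python
import time
from pathlib import Path
from typing import Callable, Dict, Iterable


def upload_all(
    files: Iterable[Path],
    upload_one: Callable[[Path], None],
    max_attempts: int = 3,
    base_delay: float = 1.0,
) -> Dict[str, str]:
    """Try each file up to max_attempts times; return a status per file."""
    status: Dict[str, str] = {}
    for path in files:
        for attempt in range(1, max_attempts + 1):
            try:
                upload_one(path)  # hypothetical: stands in for the real upload call
                status[str(path)] = "uploaded"
                break
            except Exception as exc:
                # Record the latest failure so the caller can see what went wrong.
                status[str(path)] = f"failed: {exc}"
                if attempt < max_attempts:
                    # Exponential backoff: base_delay, then 2x, 4x, ...
                    time.sleep(base_delay * 2 ** (attempt - 1))
    return status
```

The returned dictionary is the tracking state: any entry that does not read "uploaded" has to be retried or investigated later, which is the overhead a single archive upload would avoid.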