IBM Watson™ Discovery Service Ideas

Remove or Strip HTML tags during ingestion and/or

Affects both ingestion and document conversion / segmentation.


Stripping the tags makes the ingested content easier to use, because it becomes consumable in a more basic format. The original customer use case was not only to remove or strip HTML but also to segment the content based on HTML header level. So this content:



My content for first document.


Is this and I really am not sure if or how to handle <b>stylistic markup</b>




And here is my second document.



would be ingested as two JSON documents.
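
As a rough illustration of the requested behavior (not how Discovery implements it), the following sketch uses only Python's standard `html.parser` module to split HTML at header tags and strip the remaining markup. The names `HeaderSegmenter`, `segment_html`, `source_id`, and the output fields `title`, `text`, and `segment` are hypothetical, and the sample HTML is an assumed reconstruction, since the header tags of the example above did not survive on this page.

```python
import json
from html.parser import HTMLParser

HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Assumption: these tags are purely stylistic, so their text is kept inline.
INLINE_TAGS = {"a", "b", "code", "em", "i", "span", "strong", "u"}


class HeaderSegmenter(HTMLParser):
    """Hypothetical helper: start a new segment at every header tag, drop all markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.segments = []       # finished {"title", "text"} dicts
        self._title = []         # text chunks of the current header
        self._body = []          # text chunks below the current header
        self._in_header = False

    def _flush(self):
        # Collapse whitespace and store the segment if it has any content.
        title = " ".join("".join(self._title).split())
        text = " ".join("".join(self._body).split())
        if title or text:
            self.segments.append({"title": title, "text": text})
        self._title, self._body = [], []

    def _boundary(self, tag):
        # Block-level tags separate words; inline (stylistic) tags do not.
        if tag not in INLINE_TAGS:
            (self._title if self._in_header else self._body).append(" ")

    def handle_starttag(self, tag, attrs):
        if tag in HEADER_TAGS:
            self._flush()
            self._in_header = True
        else:
            self._boundary(tag)

    def handle_endtag(self, tag):
        if tag in HEADER_TAGS:
            self._in_header = False
        else:
            self._boundary(tag)

    def handle_data(self, data):
        (self._title if self._in_header else self._body).append(data)

    def close(self):
        super().close()
        self._flush()


def segment_html(html, source_id):
    """Return one plain-text record per header-delimited section."""
    parser = HeaderSegmenter()
    parser.feed(html)
    parser.close()
    # source_id and segment keep a link back to the original file and order.
    return [
        {"source_id": source_id, "segment": i, **seg}
        for i, seg in enumerate(parser.segments)
    ]


sample = (
    "<h1>First</h1><p>My content for first document.</p>"
    "<p>Is this <b>stylistic markup</b></p>"
    "<h1>Second</h1><p>And here is my second document.</p>"
)
docs = segment_html(sample, "doc-1")
print(json.dumps(docs, indent=2))
```

In this sketch the stylistic `<b>` tag is simply dropped and its text kept, and each record carries `source_id` and `segment` so the pieces can be traced back to the original file. Those are only one possible answer to the open questions listed below.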


Follow up investigation on this idea:

  • how to handle embedded stylistic markup
  • if the documents are split, should some relationship be kept between them and the source

Guest, Sep 5 2018