VR Communications LLC


Building bot-ready knowledge bases #10: Using DITA/XML metadata as a bot training kit

March 23, 2020 Anna

Our experimental initiative to prototype a bot-ready information solution using Google’s Dialogflow

This post is part of a series. For more information and links to other posts in the series, see the “Building bot-ready knowledge bases” home page.

Our transition to programmatic solutions

In our “bot-ready” projects and the series of posts about them, we’ve started shifting our focus from improving and “botifying” our content to experimenting with programmatic solutions and application integration. Dick’s most recent post on webhooks, #9 in the series, is an example of that change in emphasis.

However, before we moved away from GROCERYbot and its knowledge base, we wanted to do the following:

  • Add training phrases to the DITA/XML grocery shopping files
  • Review “grocery shopping” as a suitable model for more advanced projects

Adding training phrases to the DITA/XML grocery shopping files

The last additions we made to our grocery shopping files were sample user queries. As the following image shows, they serve both as a checklist of the article content (“Does the article answer the most likely questions users might have about the topic?”) and as training phrases for the bot.

We labeled the questions as “faqs” (“frequently asked questions”).

Dialogflow can automatically convert these FAQs into intents (which, in GROCERYbot, we created laboriously by hand), and we intend to experiment with this capability in the near future.

Training phrases in a grocery shopping input file
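For readers who cannot see the image, here is a minimal Python sketch of how training phrases stored in a DITA topic might be read back out. It is an illustration only: the sample topic, the use of `othermeta` elements in the prolog, and the `faq` label as an attribute value are our assumptions for this sketch, not a copy of the actual grocery shopping files. The function name `training_phrases` is also hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA concept topic. The element layout and the "faq" name
# are assumptions for illustration, not the real grocery shopping markup.
SAMPLE_TOPIC = """\
<concept id="produce_overview">
  <title>Produce: Overview</title>
  <shortdesc>How to choose and store fresh produce.</shortdesc>
  <prolog>
    <metadata>
      <othermeta name="faq" content="How do I pick ripe fruit?"/>
      <othermeta name="faq" content="Where should I store vegetables?"/>
      <othermeta name="review-note" content="Not a training phrase"/>
    </metadata>
  </prolog>
  <conbody><p>Article text goes here.</p></conbody>
</concept>
"""


def training_phrases(topic_xml, label="faq"):
    """Return the content of every othermeta element whose name matches label."""
    root = ET.fromstring(topic_xml)
    return [
        element.get("content")
        for element in root.iter("othermeta")
        if element.get("name") == label and element.get("content")
    ]


phrases = training_phrases(SAMPLE_TOPIC)
for phrase in phrases:
    print(phrase)
```

Keeping the phrases in the topic's own metadata means they travel with the article through every publishing step, so the same list can double as the author's checklist and the bot's training input.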

Reviewing “grocery shopping” as a suitable model for more advanced projects

The following image shows the metadata in the final version of “Produce: Overview,” as viewed in the published HTML file.

XHTML output file for “Produce: Overview”, accessed with “view source”

These items are not directly available to either a person or a bot viewing only the published content, but they can easily be accessed programmatically. Some of the more useful metadata elements in a bot environment are the following:

  • Title and abstract/description

The information contained in these elements could be helpful to users as “teaser text” included in a set of bot-suggested articles to help users make an educated guess about the articles’ potential usefulness.

  • Items labeled as “DC.subject” and “keywords” (typical keyword tags associated with content to identify important topics)

Items with a DC prefix are part of the Dublin Core Schema, a widely recognized set of vocabulary terms used to describe digital resources.

  • Content creator and contributor
  • Dates when the article was created, updated, and is due to expire
  • Training phrases mentioned above and labeled as “faq”
  • Language the article is written in
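To make “accessed programmatically” concrete, here is a minimal sketch that collects the `meta` elements and the title from the head of a published XHTML file, using only Python's standard library. The sample page, its metadata values, and the names `MetaCollector` and `extract_metadata` are hypothetical; the real output files may name or arrange these elements differently.

```python
from html.parser import HTMLParser

# Hypothetical XHTML output. The meta names and values are illustrative
# assumptions, not the actual "Produce: Overview" file.
SAMPLE_XHTML = """\
<html xml:lang="en-us" lang="en-us">
<head>
<meta name="description" content="How to choose and store fresh produce." />
<meta name="DC.subject" content="produce, fruit, vegetables" />
<meta name="keywords" content="produce, fruit, vegetables" />
<meta name="DC.language" content="en-us" />
<meta name="faq" content="How do I pick ripe fruit?" />
<meta name="faq" content="Where should I store vegetables?" />
<title>Produce: Overview</title>
</head>
<body><p>Article text.</p></body>
</html>
"""


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}          # meta name -> list of content values
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name") and attributes.get("content") is not None:
            # A name can repeat (for example, several "faq" entries), so keep a list.
            self.meta.setdefault(attributes["name"], []).append(attributes["content"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html_text):
    """Return the page title and all named meta values found in html_text."""
    parser = MetaCollector()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}


record = extract_metadata(SAMPLE_XHTML)
print(record["title"])
print(record["meta"]["faq"])
```

Storing each meta name as a list keeps repeated entries, such as multiple training phrases, instead of letting the last one overwrite the rest.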

Having this information assembled as a kind of “bot training kit” is particularly useful because it allows the creators, editors, and owners of the content (the “content team”) to focus the bot’s attention on the information they consider to be most important to them and their users. A bot could not “intuit” all of this information from the KB article content alone, and no bot could create an abstract or short description to match the quality of one created by the article’s author.

In addition, it allows the content team greater control over how the information is displayed to the user. For example, the bot could be directed to show only the short description text as a “teaser,” rather than allowing the bot to select or create its own text for that purpose.
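As a small illustration of that kind of control, the sketch below builds the text for a suggested article from the author's title and short description only, trimming the description if it runs long. The function name `suggestion_card` and the 160-character limit are our assumptions for this example; they are not Dialogflow settings.

```python
def suggestion_card(title, short_description, max_chars=160):
    """Build teaser text from author-written metadata only.

    The bot never writes its own summary here: it shows the short
    description as supplied, shortened only if it exceeds max_chars.
    """
    text = " ".join(short_description.split())  # collapse stray whitespace
    if len(text) > max_chars:
        text = text[: max_chars - 3].rstrip() + "..."
    return f"{title}\n{text}"


card = suggestion_card(
    "Produce: Overview",
    "How to choose   and store fresh produce.",
)
print(card)
```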

If appropriate, depending on the training model and various NLP (natural language processing) factors, the article text could also be added to the kit.

So, yes, we believe our grocery shopping project can serve as a model for us to build on, and we are moving on to more interesting and potentially useful projects.

What’s next?

Our most recent Dialogflow experiments involve integration with apps like Telegram and Slack, and the “bot training kit” described above is playing a key role.

Dick has written a Python script to pull key metadata out of the grocery shopping HTML output files and populate a CSV file, which he is using as a knowledge base connector to a Dialogflow-based, follow-on version of GROCERYbot. In addition, he integrated “GROCERYbot2” with the Telegram instant messaging app, so users can take advantage of Telegram’s advanced text and voice platform to access grocery shopping information.
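The sketch below is not Dick's script; it is a minimal illustration of the last step, turning metadata records that have already been extracted into CSV rows. We assume a two-column layout with the question first and the answer second, and no header row. The record fields (`title`, `description`, `faqs`), the sample values, and the function names are hypothetical.

```python
import csv
import io

# Hypothetical records, as they might look after metadata extraction.
RECORDS = [
    {
        "title": "Produce: Overview",
        "description": "How to choose and store fresh produce.",
        "faqs": ["How do I pick ripe fruit?", "Where should I store vegetables?"],
    },
]


def faq_rows(records):
    """Yield one (question, answer) pair per training phrase."""
    for record in records:
        answer = f'{record["title"]}: {record["description"]}'
        for question in record["faqs"]:
            yield (question, answer)


def write_faq_csv(records, stream):
    """Write question/answer rows to stream and return the number of rows."""
    writer = csv.writer(stream, lineterminator="\n")
    count = 0
    for row in faq_rows(records):
        writer.writerow(row)
        count += 1
    return count


buffer = io.StringIO()  # stands in for an open CSV file
row_count = write_faq_csv(RECORDS, buffer)
print(buffer.getvalue())
```

Writing through the `csv` module rather than joining strings by hand means that commas and quotation marks inside a question or description are escaped correctly.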

We will publish more about our amazing results in the next post.



Copyright

2019-2021 VR Communications LLC
