IBM Support

IBM Social Media Analytics FAQ

Technote (FAQ)


Is there a Frequently Asked Question (FAQ) list for IBM Social Media Analytics?


This FAQ relates to IBM Social Media Analytics versions 1.2 and 1.3. References to "model" signify Themes/Concepts/Hotwords in SMA 1.2 and Themes/Concepts in SMA 1.3.

SMA 1.3

Monitoring Usage


Reports and Analysis

Integration Options

Additional information

Tech notes and customer documentation provide a wealth of information. To find the latest go to and type 'social media analytics' in the Search Support and Downloads box in the right-hand column and make sure 'within my selected products' is selected.

SMA 1.3

1) What's new in SMA 1.3?

    New simplified Configuration UI. You can now configure an analysis from a single tab. You can use include, context, and exclude terms in concepts and themes to retrieve data, instead of having to define queries separately.

    Greater flexibility in categorizing and filtering results. To simplify configuration and give you greater flexibility, IBM Social Media Analytics lets you analyze relationships between any themes in Reporting, not just between types and the list of hotwords.

    All themes considered for evolving topics. In SMA 1.2, you selected the "Consider for evolving topics" option per type (now named theme). Now you choose whether to run the evolving topics phase at project level, and all themes in a project are considered.

    Choose to include or exclude themes from data fetching. You can include or exclude themes from the data fetch process to ensure that only relevant documents are fetched by enabling or disabling the Fetch Documents option. Excluded themes are still considered in analysis.

    Affinity analysis in reporting. In IBM Social Media Analytics, the reporting environment offers an Affinity measure for many reports. The Affinity measure analyzes how closely two dimensions (or attributes of a dimension) are related to each other. This helps you to gain insight about possible strengths, weaknesses, opportunities, or threat areas based on the affinity between the dimensions or attributes.

    Improved snippet view. In Reporting, an improved snippet view shows more detail about the snippets that are associated with bar charts. To access the snippet view, right-click a bar in a chart, and click Go To > Snippet View. In the snippet view, concepts and sentiment types are underlined according to the colors in the legend; relationships have a dashed underline.

    Sentiment in Portuguese and Russian languages

    IBM Cognos Mobile is now enabled. Mobile reports can be built and deployed through the IBM Cognos Mobile Application.

2) Any changes to terminology?
  • Types are now named themes.
  • Ad-hoc jobs are now named snapshots in time.
  • Scheduled jobs are now named ongoing.
  • Hotwords have been removed.

3) What happens to a model migrated from SMA 1.2?
  • Hotwords are converted into concepts, and are contained in a theme that is named "Hotwords" on the Configuration screen. In the Analysis UI, the "Hotwords" option applies to the theme selected in the "Hotwords theme for Analysis application" option, which is part of the Analysis Options. In Reporting, you can use any Theme for Relationship filtering.
  • Queries defined manually in SMA 1.2 are ported as "Global Queries" and are part of the "Sources and Media Sets" Analysis Options. For simplicity and manageability, it is recommended to use the Theme "Fetch Documents" option, which uses the Concept definitions to pull data, rather than continuing with Global Queries.

4) What are the most important changes to consider when building a model in the new Configuration UI?
  • Minimize the use of Global or Concept Queries and let the application optimize your data fetch.
  • Decide which Themes are to be used to "Fetch Documents".
  • Make extensive use of include, context, and exclude terms when defining concepts. This keeps your analysis precise while reducing your data cost.
  • Refine your model without Twitter to reduce cost related to GNIP PowerTrack. When the model is hardened, make use of concept based previews for further refinement.

5) Can the new Configuration UI "Preview" show Twitter? Can the "Show Estimates" calculate Twitter?

    By default, Twitter documents are not previewed or counted for concepts, to prevent unnecessary data retrieval costs. To enable Twitter, deselect all data sources other than Twitter in the Analysis Options and redo the operations. Note that such an operation will trigger collection of Tweets at GNIP PowerTrack.

6) How do you calculate affinity?

    Affinity analyzes how closely two dimensions (or attributes of a dimension) are related to each other. This helps you to gain insight about possible strengths, weaknesses, opportunities, or threat areas based on the affinity between the dimensions or attributes.

    The Affinities measure is based on a statistical method that is known as chi-square test of independence. This method estimates how often two dimensions should occur together if they were independent (for example, products and product features). It compares the estimate with the actual count, and identifies whether the difference is statistically significant (either higher or lower than expected).
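
    The chi-square comparison described above can be illustrated with a minimal sketch. It assumes a simple 2x2 co-occurrence table (snippets that mention both attributes, only one, or neither) and a 0.05 significance threshold; the function name and the threshold are illustrative assumptions, not the product's actual implementation.

```python
# Chi-square critical value for 1 degree of freedom at p = 0.05 (assumed threshold).
CRITICAL_VALUE_DF1_P05 = 3.841

def affinity(both, only_a, only_b, neither):
    """Chi-square test of independence on a 2x2 co-occurrence table.

    Returns (expected_both, chi2, verdict).
    """
    n = both + only_a + only_b + neither
    row_totals = [both + only_a, only_b + neither]   # A present / A absent
    col_totals = [both + only_b, only_a + neither]   # B present / B absent
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("each attribute must be present and absent at least once")
    observed = [[both, only_a], [only_b, neither]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if A and B were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    expected_both = row_totals[0] * col_totals[0] / n
    if chi2 < CRITICAL_VALUE_DF1_P05:
        verdict = "not significant"
    elif both > expected_both:
        verdict = "higher than expected"
    else:
        verdict = "lower than expected"
    return expected_both, chi2, verdict

# Hypothetical counts: 200 snippets, 50 mention product A, 50 mention feature B.
expected_both, chi2, verdict = affinity(both=30, only_a=20, only_b=20, neither=130)
print(expected_both, round(chi2, 2), verdict)
```

    In this made-up example, independence would predict 12.5 joint mentions, so 30 observed joint mentions is reported as a higher-than-expected affinity.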

    Affinity reports in the Reporting UI have been reworked from a statistical perspective and therefore yield slightly different results from those in the Analysis UI. In general, the Reporting UI displays fewer affinities, but the ones it does display are more likely to be of interest.

7) The Configuration UI Preview seems to ignore language setting?

    When a language is selected in the Analysis Options and Preview is used at the concept level, you might find documents not matching the language filter. This is often the case for Twitter as it is difficult to determine the language due to the restricted document length (140 characters maximum). Document-level language detection is provided by data providers and not always possible in this case.

8) What browsers do you support?

9) Any changes to the data offering since SMA 1.2?

    Retention for Twitter data has increased from 3 months to 13 months.

10) How are snippets generated?

    Snippets are created at theme level, and consist of a small text segment that surrounds a concept match in a document. A snippet consists of the sentence that contains the concept and the sentences that surround the sentence that contains the concept. You can have more than one concept match in a snippet.
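
    As a rough illustration of the description above, the following sketch builds snippets from the sentence that contains a concept match plus its neighbors. The naive sentence splitter, the window of one sentence on each side, and the function name are assumptions made for the example; the product's actual segmentation rules are not documented here.

```python
import re

def make_snippets(document, concept_terms, window=1):
    """Return snippets: each matching sentence plus `window` sentences around it."""
    # Naive sentence splitter (assumption): break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    hits = [i for i, s in enumerate(sentences)
            if any(term.lower() in s.lower() for term in concept_terms)]
    snippets = []
    covered_until = 0  # index just past the last sentence already inside a snippet
    for i in hits:
        if i < covered_until:
            continue  # this match already sits inside the previous snippet
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        snippets.append(" ".join(sentences[start:end]))
        covered_until = end
    return snippets

doc = ("I bought a phone. The Acme X1 is fast. Acme support was slow. "
       "Shipping took a week. Nothing else to say.")
for snippet in make_snippets(doc, ["Acme"]):
    print(snippet)
```

    Here the two adjacent "Acme" mentions fall into a single snippet, which mirrors the note that a snippet can contain more than one concept match.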

11) What is the use of roles?

    A role is a label for the theme and can be selected as a filter in Reporting. Roles help users to select the right themes in Reporting, and can improve the quality of the automatic sentiment detection. You can select the following roles: brands, products, features, spokespeople, events, and campaign messages.

12) What happens when a user uses double quotes in the include/context terms?

    SMA will attempt to match these quotes in the document. Most of the time, this is not the intended behavior. To detect documents that talk about 'Social Media Analytics', use
      Social Media Analytics

    as your include term. SMA will automatically generate the optimized query "Social Media Analytics" for data fetching for this concept.

13) Can Twitter be sampled rather than retrieving the full firehose?

    Yes. When you purchase IBM Social Media Analytics SaaS, you can choose which sampling activation option you would like to use to access Gnip’s Twitter data. For example: 25% Twitter sample, 50% Twitter sample, or full Twitter firehose. A lower sample should translate to lower Gnip costs.

    Whether to access a sample of Gnip’s Twitter data or Gnip’s full firehose depends on your use case and budget. If a statistically random sample provides sufficient and meaningful analysis for you, consider the sampling option.

14) I do not see sentiment terms for Chinese?

    In the Configuration UI Analysis Options, use Chinese characters when filtering for Chinese sentiment terms. For example, "好" displays positive terms starting with this character.

15) How does character case affect the SMA analysis?

    Theme and Concept Definitions: These definitions are matched regardless of case: the term IBM will match ibm, IBM, and iBM.

    Sentiments: Sentiment terms are matched case-sensitively, exactly as they are defined in the respective sentiment dictionary. This way, "grand" will not match "Grand Canyon". One exception is the start of a sentence: in these circumstances, SMA will match regardless of case, to capture headlines or sentences like "Great product!"

    Queries: Queries are case-insensitive.
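
    The case rules above can be mimicked with a small sketch. The two helper functions and the whole-word regular expressions are hypothetical and only approximate the described behavior: concept terms match in any case, while sentiment terms match exactly except at the start of a sentence.

```python
import re

def concept_matches(term, text):
    """Concept and theme terms: matched regardless of case."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None

def sentiment_matches(term, sentence):
    """Sentiment terms: case-sensitive, except at the very start of a sentence."""
    pattern = re.escape(term) + r"\b"
    if re.search(r"\b" + pattern, sentence):
        return True  # exact-case match anywhere in the sentence
    # Sentence start: ignore case so that headlines like "Great product!" still match.
    return re.match(r"\s*" + pattern, sentence, re.IGNORECASE) is not None

print(concept_matches("IBM", "I like iBM servers"))
print(sentiment_matches("grand", "We visited the Grand Canyon"))
print(sentiment_matches("great", "Great product!"))
```

    Note that under the sentence-start exception, a sentence beginning with "Grand Canyon" would still match "grand" in this sketch.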

16) Evolving Topics Top 10 lists - Analysis UI vs. Reporting UI

    SMA 1.3 has changed how topics are shown in the Reporting UI to improve the relevance of the display.

    One snippet can be assigned to multiple topics. The overlap of a certain topic and the snippet is the "topic weight" of the snippet. All topic weights of a snippet sum up to 1. Very often, you have one well-matching topic (weight > 0.7), and several "borderline" ones.

    In the Analysis UI, all topics and all topic weights for a snippet count towards the appropriate topic rivers. This is why rivers are shown "by weight". In the Reporting UI, by contrast, only the topic with the highest weight is used for each snippet. The Reporting UI will also show the number of snippets that fall into this "top topic" for the given selection.
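
    The difference between the two displays can be illustrated with a short sketch. The topic names, weights, and function names are made up for the example; only the two counting rules (sum all weights versus count each snippet under its highest-weight topic) come from the description above.

```python
from collections import Counter, defaultdict

def totals_by_weight(snippets):
    """Analysis UI style: every topic weight of every snippet is added up."""
    totals = defaultdict(float)
    for weights in snippets:
        for topic, weight in weights.items():
            totals[topic] += weight
    return dict(totals)

def counts_by_top_topic(snippets):
    """Reporting UI style: each snippet counts once, for its highest-weight topic."""
    return Counter(max(weights, key=weights.get) for weights in snippets)

# Hypothetical snippets; the topic weights of each snippet sum to 1.
snippets = [
    {"battery": 0.8, "price": 0.2},
    {"battery": 0.4, "price": 0.6},
    {"price": 0.75, "screen": 0.25},
]
print(totals_by_weight(snippets))
print(counts_by_top_topic(snippets))
```

    In this example "screen" contributes to the weighted totals but never appears as a top topic, which is one way the two top 10 lists can differ.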

Monitoring Usage

1) What does the Current Usage Bar Chart at top right measure?

    The Usage Bar indicates the number of documents the customer has retrieved in the current month using IBM Social Media Analytics. IBM Social Media Analytics is licensed based on the number of documents retrieved per month. The usage bar increases when Analysis is run. The usage bar will reset at the beginning of every month. Usage is not affected while previewing documents prior to the analysis.

    A "document" is defined as one blog, discussion, forum posting, message board entry, microblog (Twitter), news item, or a Facebook posting.

2) Can this be reset in some way?

    Yes. The administrator can reset it through a command-line interface. For SaaS customers, please contact if there is a specific requirement related to usage.

3) Does it look at usage consumption for just one project (if I log in as Project admin) or all projects across the server?

    All projects. The usage is calculated per deployment as per license.

4) Are there any limits to the number of Projects on a server?

    For a SaaS deployment, there is a project limit of 22. For a Perpetual Software License (On Premise deployment), the project limit is dependent on the environment setup. More projects could be added if more memory is provisioned on the data node.

5) If a running analysis fails for any reason and is then relaunched, does this count against the monthly document limit?

    No. Documents are counted only when an analysis completes. After a failure, the job is reset and the documents are fetched again on relaunch, so the documents from the failed run do not count against the document limit.

6) Does it go down if I delete a project?


7) Why is the Twitter count from GNIP PowerTrack different than the IBM SMA document count?

    There is not always a 1:1 exact match between tweets collected by GNIP PowerTrack and tweets counted by IBM SMA. The reasons are as follows:

    1. The user can "test/preview" a query that will trigger collection for 24 hours at GNIP PowerTrack. Such queries might never be used in any analysis. In this scenario, GNIP PowerTrack is incremented but not IBM SMA.
    2. The ability to define complex models in the Configuration UI cannot be matched by the query limitations of the data providers. For instance, when numerous "exclude" terms are used by a theme with "Data Fetch" enabled, more documents will be collected by GNIP PowerTrack and then discarded during analysis when the "excludes" are considered.
    3. The SMA query is translated into a GNIP PowerTrack rule. This can cause slight to large differences. See this technical note for more information. NOTE: The “language” parameter is not passed from IBM SMA to GNIP PowerTrack prior to SMA 1.3 IF5. In this case, older versions will collect more documents from GNIP PT.
    4. A user can use the same query across projects. In this situation, the SMA count is increased but not GNIP PT.

1) How is sentiment analyzed in IBM Social Media Analytics?

    When a snippet is created, the text related to your model is analyzed using IBM’s proprietary sentiment algorithms. These algorithms draw on over a decade of IBM research in this area and are based on Natural Language Processing.

2) What is Evolving Topics?

    Evolving Topics is a unique algorithm that analyzes social media content to discover threads of conversation emerging in social media. This is different from a general word cloud, where the focus is on overall recurrence in social media. When you analyze social media content, you specify the time frame over which you want to assess evolving topics. For example, discover topics that are emerging in social media over the last 30 days.

3) In SMA 1.2 what is the impact of clicking the "Consider for evolving topics" for types/concepts/hotwords?

    You can click "consider for evolving topics" for types and hotwords (not for concepts). For types, if "consider for evolving topics" is selected, all snippets from those documents that have the selected types will be used to analyze and identify topics. You need to have at least one type selected to have evolving topics results. For hotwords, if "consider for evolving topics" is selected, all snippets from documents which have one of those hotwords in one of their snippets will be used to analyze and identify evolving topics. If no hotwords are selected to be considered for evolving topics, only documents with "no hotword" in at least one of their snippets will be selected.

    In SMA 1.3 concepts are always considered for Evolving Topics. The user however can choose to not run Evolving Topics by specifying accordingly in the Analysis Options.

4) Why is Evolving Topics analyzed using documents and not snippets?

    The Evolving Topic model is built on full documents, not snippets. Why? Because we found that this gives "better" results, yielding topics that can show "previously unknown" content. However, this may mean that some topics contain keywords that are not found in any of the existing snippets (just in the full documents).

5) Is query size limited on number of characters?

    In SMA 1.2 the query is limited to 200 characters. Larger queries can be broken into multiple queries to improve manageability. Managing long queries is not recommended.

    In SMA 1.3 and the new Configuration UI, there is no longer any need to enter queries, so this limitation no longer applies.

6) Is there any way to do a global pull of all documents for a particular web site? For instance, pull everything for certain Twitter handles or Facebook pages, even if it does not match search criteria.

    No. However, media sets can be defined so the analysis can be broken down for one or a specific set of sites.

7) Can you change any of the analysis options or model (Themes and Concepts) during analysis (ongoing or one-time) without restarting the analysis all over again?

    The analysis (ongoing or one-time) must be restarted for any changes to the analysis options or model. This ensures consistency of the analytics between analysis runs.

8) Regarding Media Sets: What are the URL matching rules that apply?

    The product attempts to match the prefix of the URL. For example, by entering, the product will be considering and There is no need to enter “http://”
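
    Prefix matching of this kind can be sketched as follows. The domain names are hypothetical placeholders, and the scheme stripping and lowercase comparison are assumptions for the example rather than documented product behavior.

```python
def strip_scheme(url):
    """Drop a leading http:// or https:// so entries can be typed without it."""
    for scheme in ("http://", "https://"):
        if url.lower().startswith(scheme):
            return url[len(scheme):]
    return url

def in_media_set(document_url, media_set_entries):
    """True if the document URL starts with any media set entry (prefix match)."""
    doc = strip_scheme(document_url).lower()
    return any(doc.startswith(strip_scheme(entry).lower())
               for entry in media_set_entries)

# Hypothetical media set containing a single entry.
media_set = ["blog.example.com"]
print(in_media_set("http://blog.example.com/2015/03/some-post", media_set))
print(in_media_set("http://news.example.com/article", media_set))
```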

Analysis & Reporting UI

1) How are geography analytics determined in SMA?

    Social Media Analytics analyzes the author profile text to determine the permanent location of an author. Location is provided on three levels (when available): Country, State, City. Geography is extracted using text analysis rules and dictionaries with more than 35,000 locations across the globe. The language and site are taken into account; for instance, a German post from “Stuttgart“ is not from Stuttgart, Arkansas. Finally, we always show the number of “unknowns“ as well, to help interpret the findings. Locations included:
      • all cities with more than 100,000 inhabitants in DE, ES, FR, UK, RU
      • all cities with more than 500,000 inhabitants in Europe, China, South America
      • both the native name of the city (e.g. Москва́), as well as its English translation (e.g., Moscow)
      • state information for above cities, where available

2) How are demographic analytics determined in SMA?

    Social Media Analytics analyzes the text to determine gender, marital status (is author married?), and parental status (is author a parent?). To identify gender, we analyze author first name, author nickname, and author content. To identify if the author is married or a parent (for English, German, French, and Spanish content), analysis is performed using trigger terms and text analysis rules. Nicknames can be a good source of information as well (“SuperMom2012“).

3) How are behavior analytics determined in SMA?

    Social Media Analytics identifies behavior patterns based on text analysis rules. For example: Users are authors who mention they “have” or “use” a certain product or service. Recommenders are authors mentioning “you should use X“. Detractors are authors mentioning “stay away from X“. Finally, Prospective users are authors who mention they “want”, “would like”, or “can’t wait to buy” a product or service.

4) How is influencer analysis determined?

    Influencer analysis is initially provided based on Klout scores. Klout scores are provided for Twitter authors only, with the purchase of the Gnip PowerTrack service. If other scores are requested, a customer can integrate up to 10 different scores through simple configuration changes (supported for On Premise customers only). The customer is responsible for the contract with the various score providers.

5) Can we get direct Snippet view on Demographic reports or do we need to go through Author view first?

    At this point, there is no way to go directly to a snippet view.

6) Some of the reports in Share of Voice and Reach are similar? How are these terms defined and what is the fundamental difference?

    Share of Voice is focused on analyzing different dimensions of snippet volume, such as a breakdown of snippet volume by sentiment. Reach is focused on analyzing different dimensions related to the sources of social media content, such as a geographic breakdown by source. For example, you can analyze how many snippets came from blogs and authors located in the US. The best way to get a report description is to use the “?” link on each report.

7) Will a SaaS customer be able to do custom reporting? Will this be enabled?

    A SaaS customer can take full advantage of Cognos report authoring capabilities and create custom reports on the existing model. If changes are required to the underlying model (for example, add new data to the analysis), this is also possible with the correct Cognos license. Please see this tech note for integration options:

8) Snippet Export is supported in SaaS, correct? Is there a way to export relational data and/or Snippet data to another tool? Technical or license restrictions for this?

    Snippet Export is supported for SaaS. However, some fields in Twitter will be “Restricted” during the export. There is also a limit on the number of tweets that can be exported.

9) For many reports, the results are sorted alphabetically. Can we have some sort controls and top X selecting/sorting?

    The layout of the reports in Reporting is static. The layout could be changed by creating a new Dashboard. Training videos provide insights on how to create dashboards:

10) Can we control the window for when evolving topics get evaluated?

    Yes. The window is configured when the job is configured (SMA 1.2) or in the Analysis Options (SMA 1.3).

Integration Options

1) What are the integration options for Social Media Analytics with other analytics products or business processes?

2) Is there any option to integrate with our corporate LDAP?

Document information

More support for: Social Media Analytics

Software version: 1.2, 1.3

Operating system(s): Linux

Reference #: 1638312

Modified date: 13 March 2015
