InfoConnect for Unisys
Get Text from a Web Page

This sample program shows how to create an InfoConnect Web document and get text from a Web page. To do this, we get a Web control object to access the Web page document and then use an XPath expression to identify the Web element (text) on the page, as follows:

IWebElement webElement = webDocument.GetElement("HTML/BODY/TABLE/TBODY/TR[1]/TD[1]");

The text of that element is then displayed on the console.

The HTML markup for the basic demo page we connect to is shown below:

The table.html file
<!DOCTYPE html>
<html>
<head>
  <title>HTML Test Page</title>
</head>
<body>
  <table border="1">
  <tbody>
    <tr id="first-table-tr">
      <td id="USDOLLAR">U.S. Dollars</td>
      <td>1</td>
      <td>1.3240</td>
      <td>108.25</td>
      <td>0.8114</td>
    </tr>
    <tr>
      <td>Canadian Dollars</td>
      <!--Use XPath to get the data on the following line-->
      <td>0.7553</td>
      <td>1</td>
      <td>81.7598</td>
      <td>0.6128</td>
    </tr>
    <tr>
        
.................
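
For comparison, the same table cell can be located outside InfoConnect with the standard .NET System.Xml classes. The following is a minimal sketch, not part of the InfoConnect API: the names XPathDemo, GetText, and TableMarkup are hypothetical, and the markup is a trimmed, well-formed copy of the first two rows shown above. Note that standard XPath is case-sensitive and counts positions from 1, so the second row and second column is tr[2]/td[2] here, whereas the InfoConnect sample addresses the same cell as TR[1]/TD[1].

```csharp
using System;
using System.Xml;

static class XPathDemo
{
    // Trimmed, well-formed copy of the first two table rows of the demo page.
    public const string TableMarkup = @"<html><body><table border=""1""><tbody>
  <tr id=""first-table-tr""><td id=""USDOLLAR"">U.S. Dollars</td><td>1</td><td>1.3240</td><td>108.25</td><td>0.8114</td></tr>
  <tr><td>Canadian Dollars</td><td>0.7553</td><td>1</td><td>81.7598</td><td>0.6128</td></tr>
</tbody></table></body></html>";

    // Returns the text of the first node that matches the XPath expression,
    // or null when nothing matches.
    public static string GetText(string markup, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node?.InnerText;
    }

    static void Main()
    {
        // Standard XPath positions start at 1: second row, second column.
        Console.WriteLine(GetText(TableMarkup, "/html/body/table/tbody/tr[2]/td[2]")); // prints 0.7553
    }
}
```

XmlDocument requires well-formed XML, so this approach suits only markup you control; it is shown here to illustrate how the path maps onto the table, not as a replacement for the Web control.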

To get text from a Web page

  1. In Visual Studio, create a new Console Application project and add references for the following InfoConnect assemblies. (Depending on your version of Visual Studio, these can be found either on the .NET tab or under Assemblies | Extensions.)
    Attachmate.Reflection
    Attachmate.Reflection.Framework
  2. Add a reference to the System.Windows.Forms assembly.
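  3. In InfoConnect, create a Web page document and set the URL to
    http://docs.attachmate.com/reflection/prog-demo/table.html.
    Save the document as GetDataFromWeb.urlx in your Documents\Micro Focus\InfoConnect folder, because the code in the next step opens it from that location.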
    
  4. Replace all the code in the Program.cs file with the following code:
       
    Get text from a Web page
    using System;
    using System.Collections.Generic;
    using System.Text;
    using Attachmate.Reflection.Web.Msie;
    using Attachmate.Reflection.Framework;
    using Attachmate.Reflection.UserInterface;
    using System.Windows.Forms;
    using Application = Attachmate.Reflection.Framework.Application;
    
    namespace GetTextFromWebPage
    {
        class Program
        {
    
            static void Main(string[] args)
            {
                //Start a visible instance of InfoConnect or get the instance running at the given channel name
                Application app = MyReflection.CreateApplication("workspace", true);
    
                //Open the Web session document that is set to the "docs.attachmate.com/reflection/prog-demo/table.html" URL
                IWebControl webControl = (IWebControl)app.CreateControl(Environment.GetEnvironmentVariable("USERPROFILE") + @"\Documents\Micro Focus\InfoConnect\GetDataFromWeb.urlx");
    
                //Get the Web document and create a View to make the Web document visible 
                IWebDocument webDocument = webControl.Document;   
                IFrame frame = (IFrame)app.GetObject("Frame");
                frame.CreateView(webControl);
    
                //Wait until the document is ready
                while(webControl.ReadyState != System.Windows.Forms.WebBrowserReadyState.Complete)
                {
                    System.Threading.Thread.Sleep(5000);
                }
    
                //Use XPath to get the Web element in the second row and second column of the table
                IWebElement webElement = webDocument.GetElement("HTML/BODY/TABLE/TBODY/TR[1]/TD[1]");
    
                //Get the text and write it to the console
                string conversionRate = webElement.InnerText;
                Console.WriteLine("The currency conversion rate from Canadian to U.S. dollars: " + conversionRate);
    
                Console.ReadKey();
            }
    
        }
    }
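
The wait loop in the sample above polls every five seconds and never gives up if the page fails to load. One possible variant is a polling helper with a timeout. This is a minimal sketch under stated assumptions: WaitUntil and the Polling class are hypothetical names, not part of InfoConnect, and the condition in Main is a stand-in so the block runs on its own.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class Polling
{
    // Hypothetical helper: checks the condition repeatedly until it returns true
    // (result: true) or the timeout elapses (result: false).
    public static bool WaitUntil(Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!condition())
        {
            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }

    static void Main()
    {
        // Stand-in condition that becomes true on the third check.
        int checks = 0;
        bool ready = WaitUntil(() => ++checks >= 3,
                               TimeSpan.FromSeconds(5),
                               TimeSpan.FromMilliseconds(50));
        Console.WriteLine(ready ? "Ready after " + checks + " checks" : "Timed out");
    }
}
```

In the sample, the condition would be the same test the loop already uses, for example () => webControl.ReadyState == System.Windows.Forms.WebBrowserReadyState.Complete, and a false result could be reported on the console instead of waiting indefinitely.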
    
    

To test this project

  1. Press F5 to run the project.
  2. Verify that the currency conversion rate from Canadian to U.S. dollars is displayed on the console.
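
If you prefer to check the retrieved value in code rather than by eye, the text can be parsed as a number. The following is a minimal sketch: RateCheck and TryParseRate are hypothetical names, and the literal "0.7553" stands in for the InnerText value that the sample reads from the page.

```csharp
using System;
using System.Globalization;

static class RateCheck
{
    // Hypothetical helper: true when the cell text parses as a positive decimal
    // number. The invariant culture is used so "." is always the decimal separator.
    public static bool TryParseRate(string cellText, out decimal rate)
    {
        return decimal.TryParse(cellText, NumberStyles.Number, CultureInfo.InvariantCulture, out rate)
               && rate > 0;
    }

    static void Main()
    {
        decimal rate;
        if (TryParseRate("0.7553", out rate))
        {
            Console.WriteLine("Parsed conversion rate: " + rate.ToString(CultureInfo.InvariantCulture));
        }
        else
        {
            Console.WriteLine("The retrieved text is not a valid conversion rate.");
        }
    }
}
```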