This is a continuation of my previous article:
Add Undo/Redo or Back/Forward functionality to your application
In the last article we constructed a class that would help us add Undo/Redo or Back/Forward functionality to our application. We read about the capabilities of the class, but a working example was missing.
In this article we will build a text-only web browser. Such browsers are useful in various situations, but I won’t go into those details here.
The link to source code as well as executable file is given at the end of this article.
So, first we will create the basics of the web-browser, and then later on we will see how to add the Back/Forward capabilities to it.
Creating the Basic Text-Only WebBrowser Layout
1. Create a new VB.NET Windows project. A form named Form1 will be added by default. Rename it to TextOnlyWebBrowserForm.
2. Add a ToolStrip control, a TextBox and a StatusStrip control to it.
3. Rearrange the controls so that the form looks like the screenshot below. I hope this doesn’t need much explanation. I’ve just highlighted the important properties to be set.
4. To make the buttons look pretty I added images to them. I downloaded the button images free from www.freedesign4.me (all credits to the creators). You may use different images if you want to.
Adding the Basic Browsing Functions to our TextOnly WebBrowser
OK. Now we are ready to write some code and see our browser in action. We will start by adding the basic functions to our WebBrowser and later enhance it by implementing the Back/Forward functionality. To keep things simple, we will use the System.Net.WebClient class for this demo.
1. Open the form’s code window and add the following code:
Option Strict On
Imports System.Net
Public Class TextOnlyWebBrowserForm
Dim WithEvents MyWebClient As New WebClient
Private Sub TextOnlyWebBrowserForm_Load(sender As System.Object, e As System.EventArgs) Handles Me.Load
EnableOrDisableButtons()
StatusLabel.Text = "Ready"
End Sub
Private Sub GoButton_Click(sender As System.Object, e As System.EventArgs) Handles GoButton.Click
Dim url As String = UrlTextBox.Text
If Not Uri.IsWellFormedUriString(url, UriKind.Absolute) AndAlso Not url.Contains("://") Then url = "http://" & url
If Uri.IsWellFormedUriString(url, UriKind.Absolute) Then
Navigate(New Uri(url))
Else
MessageBox.Show("Invalid Address!", "Error", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
End If
End Sub
Private Sub StopButton_Click(sender As System.Object, e As System.EventArgs) Handles StopButton.Click
MyWebClient.CancelAsync()
End Sub
Private Sub UrlTextBox_KeyPress(sender As Object, e As System.Windows.Forms.KeyPressEventArgs) Handles UrlTextBox.KeyPress
If e.KeyChar = Chr(13) Then GoButton.PerformClick() 'enter key
End Sub
Private Sub UrlTextBox_TextChanged(sender As Object, e As System.EventArgs) Handles UrlTextBox.TextChanged
GoButton.Enabled = UrlTextBox.Text.Length > 0
End Sub
Private Sub MyWebClient_DownloadProgressChanged(sender As Object, e As DownloadProgressChangedEventArgs) Handles MyWebClient.DownloadProgressChanged
StatusLabel.Text = e.BytesReceived \ 1024 & " KB of data received..."
End Sub
Private Sub MyWebClient_DownloadStringCompleted(sender As Object, e As DownloadStringCompletedEventArgs) Handles MyWebClient.DownloadStringCompleted
If e.Error IsNot Nothing Then
WebDocumentTextBox.Text = e.Error.Message
ElseIf Not e.Cancelled Then
'' e.Result throws if the download was cancelled, so only read it on success
WebDocumentTextBox.Text = e.Result
WebDocumentTextBox.SelectionLength = 0
End If
WebDocumentTextBox.Select()
EnableOrDisableButtons()
StatusLabel.Text = "Ready"
End Sub
Private Sub Navigate(uri As Uri)
If MyWebClient.IsBusy Then MyWebClient.CancelAsync()
While MyWebClient.IsBusy
Threading.Thread.Sleep(1000)
End While
WebDocumentTextBox.Text = ""
UrlTextBox.Text = uri.AbsoluteUri
MyWebClient.DownloadStringAsync(uri)
StatusLabel.Text = "Looking up " & uri.AbsoluteUri
EnableOrDisableButtons()
End Sub
Private Sub EnableOrDisableButtons()
GoButton.Enabled = UrlTextBox.Text.Length > 0
StopButton.Enabled = MyWebClient.IsBusy
End Sub
End Class
2. Run the code. Type a URL into UrlTextBox and press Enter or click the Go button.
3. If it gets the requested web page, we are successful. The WebDocumentTextBox shows raw HTML at present. This is OK for now. We will update our code later in this article to show plain human-readable text instead of raw HTML.
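The address handling in GoButton_Click can also be tried on its own. The following is a minimal sketch that pulls the same two checks into a function; the module and the NormalizeUrl helper are hypothetical names introduced here for illustration and are not part of the downloadable project.

```vbnet
Imports System

Module UrlNormalizationSketch
    ' Hypothetical helper: returns an absolute URL, or Nothing when the
    ' input cannot be turned into a well-formed one.
    Function NormalizeUrl(input As String) As String
        Dim url As String = input
        ' Assume http:// when the user typed no scheme at all.
        If Not Uri.IsWellFormedUriString(url, UriKind.Absolute) AndAlso Not url.Contains("://") Then
            url = "http://" & url
        End If
        If Uri.IsWellFormedUriString(url, UriKind.Absolute) Then Return url
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(NormalizeUrl("example.com"))
        Console.WriteLine(NormalizeUrl("https://example.com/page"))
        ' Unescaped spaces are not well formed, so this input is rejected.
        Console.WriteLine(NormalizeUrl("not a url") Is Nothing)
    End Sub
End Module
```

Keeping the check in a function like this makes it easy to try different inputs without starting the form.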
Converting Raw HTML to Plain Text
We will now use regular expressions to strip off the HTML tags and show the web page as plain text. Regular expressions are ideal for such scenarios because they take care of the heavy logic involved in string search/replace operations.
1. Add a new class to your project and name it HtmlToPlainTextConverter.
2. Add the following code:
Option Explicit On
Imports System.Text.RegularExpressions
Public Class HtmlToPlainTextConverter
Public Shared Function Convert(ByVal html As String) As String
'' first remove the unwanted white-spaces
html = Regex.Replace(html, "\s+", " ", RegexOptions.IgnoreCase)
html = Regex.Replace(html, ">\s+<", "><", RegexOptions.IgnoreCase)
'' remove anything inside the <head> tags
html = Regex.Replace(html, "<head\b[^>]*>(.*?)</head>", "", RegexOptions.IgnoreCase)
'' remove all <style> tags
html = Regex.Replace(html, "<style\b[^>]*>(.*?)</style>", "", RegexOptions.IgnoreCase)
'' remove all <script> tags
html = Regex.Replace(html, "<script\b[^>]*>(.*?)</script>", "", RegexOptions.IgnoreCase)
'' double line breaks - <p>, <div>
html = Regex.Replace(html, "<div\b[^>]*>(.*?)</div>", "$1" & vbCr & vbCr, RegexOptions.IgnoreCase)
html = Regex.Replace(html, "<p\b[^>]*>(.*?)</p>", "$1" & vbCr & vbCr, RegexOptions.IgnoreCase)
'' <br> tags
html = Regex.Replace(html, "<br\b[^>]*>", vbCr, RegexOptions.IgnoreCase)
'' table formatting
html = Regex.Replace(html, "<table\b[^>]*>(.*?)</table>", "$1" & vbCr, RegexOptions.IgnoreCase)
html = Regex.Replace(html, "<tr\b[^>]*>(.*?)</tr>", "$1" & vbCr & vbCr, RegexOptions.IgnoreCase)
html = Regex.Replace(html, "<td\b[^>]*>(.*?)</td>", "$1" & vbTab, RegexOptions.IgnoreCase)
'' <ul> and <li>
html = Regex.Replace(html, "<ul\b[^>]*>(.*?)</ul>", "$1" & vbCr, RegexOptions.IgnoreCase)
html = Regex.Replace(html, "<li\b[^>]*>(.*?)</li>", " * $1" & vbCr, RegexOptions.IgnoreCase)
'' any other tag
html = Regex.Replace(html, "<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>", " $2 ", RegexOptions.IgnoreCase)
'' finally anything that looks like a tag <...>
html = Regex.Replace(html, "<.+?>", " ", RegexOptions.IgnoreCase)
'' reduce too much of unwanted line breaks
html = Regex.Replace(html, "\r\r+", vbCr & vbCr, RegexOptions.IgnoreCase)
'' replace frequently used html special characters
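Dim htmlChars() As String = Split("&Aacute; &aacute; &Acirc; &acirc; &acute; &AElig; &aelig; &Agrave; &agrave; &amp; &Aring; &aring; &Atilde; &atilde; &Auml; &auml; &brvbar; &Ccedil; &ccedil; &cedil; &cent; &copy; &curren; &deg; &divide; &Eacute; &eacute; &Ecirc; &ecirc; &Egrave; &egrave; &ETH; &eth; &Euml; &euml; &frac12; &frac14; &frac34; &gt; &Iacute; &iacute; &Icirc; &icirc; &iexcl; &Igrave; &igrave; &iquest; &Iuml; &iuml; &laquo; &lt; &macr; &micro; &middot; &not; &Ntilde; &ntilde; &Oacute; &oacute; &Ocirc; &ocirc; &Ograve; &ograve; &ordf; &ordm; &Oslash; &oslash; &Otilde; &otilde; &Ouml; &ouml; &para; &plusmn; &pound; &quot; &raquo; &reg; &sect; &shy; &sup1; &sup2; &sup3; &szlig; &THORN; &thorn; &times; &Uacute; &uacute; &Ucirc; &ucirc; &Ugrave; &ugrave; &uml; &Uuml; &uuml; &Yacute; &yacute; &yen; &yuml;")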
Dim plainChars() As String = Split("Á á Â â ´ Æ æ À à & Å å Ã ã Ä ä ¦ Ç ç ¸ ¢ © ¤ ° ÷ É é Ê ê È è Ð ð Ë ë ½ ¼ ¾ > Í í Î î ¡ Ì ì ¿ Ï ï « < ¯ µ • ¬ Ñ ñ Ó ó Ô ô Ò ò ª º Ø ø Õ õ Ö ö ¶ ± £ "" » ® § ¬ ¹ ² ³ ß Þ þ × Ú ú Û û Ù ù ¨ Ü ü Ý ý ¥ ÿ")
For i = 0 To htmlChars.Length - 1
html = html.Replace(htmlChars(i), plainChars(i))
Next i
'' VbCr with vbCrlf
html = html.Replace(vbCr, vbCrLf)
Return html
End Function
End Class
3. Next, we modify our code to use this class instead of outputting raw HTML. The change is in our MyWebClient_DownloadStringCompleted event handler, where the result is now passed through the converter.
Private Sub MyWebClient_DownloadStringCompleted(sender As Object, e As DownloadStringCompletedEventArgs) Handles MyWebClient.DownloadStringCompleted
If e.Error IsNot Nothing Then
WebDocumentTextBox.Text = e.Error.Message
ElseIf Not e.Cancelled Then
'' e.Result throws if the download was cancelled, so only read it on success
WebDocumentTextBox.Text = HtmlToPlainTextConverter.Convert(e.Result)
End If
WebDocumentTextBox.Select()
EnableOrDisableButtons()
StatusLabel.Text = "Ready"
End Sub
Run the code. See what happens. It should now show plain text instead of raw html.
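To see the idea behind the converter in isolation, here is a much smaller sketch that applies only three of its steps to a sample string: collapsing whitespace, dropping script blocks, and removing whatever tags remain. The TagStrippingSketch module and its StripTags function are hypothetical names used for illustration; they are a simplification, not a replacement for the HtmlToPlainTextConverter class.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TagStrippingSketch
    ' Simplified illustration: drops <script> blocks, then every remaining tag.
    Function StripTags(html As String) As String
        ' Collapse whitespace first so that "." can match across
        ' what used to be line breaks.
        html = Regex.Replace(html, "\s+", " ")
        html = Regex.Replace(html, "<script\b[^>]*>(.*?)</script>", "", RegexOptions.IgnoreCase)
        html = Regex.Replace(html, "<.+?>", " ")
        ' Tidy up the runs of spaces left behind by the removed tags.
        Return Regex.Replace(html, " {2,}", " ").Trim()
    End Function

    Sub Main()
        Dim sample As String = "<html><body><script>var x = 1;</script>" & vbLf &
            "<h1>Hello</h1><p>Plain <b>text</b> only.</p></body></html>"
        Console.WriteLine(StripTags(sample))
    End Sub
End Module
```

The order of the steps matters: the script block has to go before the generic tag pattern runs, otherwise its contents would survive as text.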
Adding Back/Forward Functionality to Our WebBrowser
Our basic WebBrowser is now complete. It is capable of navigating to the URLs we give it. Now let’s see what it would take to add Back/Forward/Refresh functionality to it.
1. We begin by adding our UndoRedoClass to the application. Copy the code from the previous part of the article.
2. Then we add a class to our application that stores the URLs we have navigated. These URLs will be used to go back and forth when the user clicks the Back or Forward button. We create a class called UrlItem that has only one property named Uri.
Class UrlItem
Private _Uri As Uri
Public Property Uri() As Uri
Get
Return _Uri
End Get
Set(ByVal value As Uri)
_Uri = value
End Set
End Property
Public Sub New(url As String)
Me.Uri = New Uri(url)
End Sub
End Class
3. Next, we modify our form code to add support for our UndoRedoClass object. Add the following code:
'' This declaration goes in the form declaration section
Dim WithEvents UndoRedoHandler As New UndoRedoClass(Of UrlItem)
Private Sub BackButton_Click(sender As System.Object, e As System.EventArgs) Handles BackButton.Click
UndoRedoHandler.Undo()
End Sub
Private Sub ForwardButton_Click(sender As System.Object, e As System.EventArgs) Handles ForwardButton.Click
UndoRedoHandler.Redo()
End Sub
Private Sub RefreshButton_Click(sender As System.Object, e As System.EventArgs) Handles RefreshButton.Click
Navigate(UndoRedoHandler.CurrentItem.Uri)
End Sub
Private Sub UndoRedoHandler_RedoHappened(sender As Object, e As UndoRedoEventArgs) Handles UndoRedoHandler.RedoHappened
Dim item As UrlItem = CType(e.CurrentItem, UrlItem)
Navigate(item.Uri)
End Sub
Private Sub UndoRedoHandler_UndoHappened(sender As Object, e As UndoRedoEventArgs) Handles UndoRedoHandler.UndoHappened
Dim item As UrlItem = CType(e.CurrentItem, UrlItem)
Navigate(item.Uri)
End Sub
We also modify the following existing procedures to fit our UndoRedoHandler object:
Private Sub GoButton_Click(sender As System.Object, e As System.EventArgs) Handles GoButton.Click
Dim url As String = UrlTextBox.Text
If Not Uri.IsWellFormedUriString(url, UriKind.Absolute) AndAlso Not url.Contains("://") Then url = "http://" & url
If Uri.IsWellFormedUriString(url, UriKind.Absolute) Then
UndoRedoHandler.AddItem(New UrlItem(url))
Navigate(UndoRedoHandler.CurrentItem.Uri)
Else
MessageBox.Show("Invalid Address!", "Error", MessageBoxButtons.OK, MessageBoxIcon.Exclamation)
End If
End Sub
Private Sub EnableOrDisableButtons()
BackButton.Enabled = UndoRedoHandler.CanUndo
ForwardButton.Enabled = UndoRedoHandler.CanRedo
RefreshButton.Enabled = UndoRedoHandler.CurrentItem IsNot Nothing
GoButton.Enabled = UrlTextBox.Text.Length > 0
StopButton.Enabled = MyWebClient.IsBusy
End Sub
Note: We named our object UndoRedoHandler, which might suggest that it performs an undo/redo operation. I kept the name consistent with the class and its members to avoid confusion. In reality it performs a back/forward function: read undo as back and redo as forward.
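If the undo-as-back idea feels abstract, a tiny history built from two stacks may help. The sketch below is a hypothetical stand-in written only for illustration; SimpleHistory and its member names are assumptions of mine, and it is not the UndoRedoClass from the previous article, whose internals may differ.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical stand-in for illustration; not the UndoRedoClass itself.
Public Class SimpleHistory(Of T)
    Private ReadOnly backStack As New Stack(Of T)
    Private ReadOnly forwardStack As New Stack(Of T)
    Private current As T
    Private hasCurrent As Boolean

    Public ReadOnly Property CurrentItem As T
        Get
            Return current
        End Get
    End Property

    Public ReadOnly Property CanGoBack As Boolean
        Get
            Return backStack.Count > 0
        End Get
    End Property

    Public ReadOnly Property CanGoForward As Boolean
        Get
            Return forwardStack.Count > 0
        End Get
    End Property

    ' Visiting a new item discards the forward history.
    Public Sub Visit(item As T)
        If hasCurrent Then backStack.Push(current)
        current = item
        hasCurrent = True
        forwardStack.Clear()
    End Sub

    Public Sub GoBack()
        If Not CanGoBack Then Return
        forwardStack.Push(current)
        current = backStack.Pop()
    End Sub

    Public Sub GoForward()
        If Not CanGoForward Then Return
        backStack.Push(current)
        current = forwardStack.Pop()
    End Sub
End Class

Module HistoryDemo
    Sub Main()
        Dim history As New SimpleHistory(Of String)
        history.Visit("http://a.example/")
        history.Visit("http://b.example/")
        history.GoBack()
        Console.WriteLine(history.CurrentItem)   ' http://a.example/
        history.GoForward()
        Console.WriteLine(history.CurrentItem)   ' http://b.example/
    End Sub
End Module
```

Going back is the same operation as undoing the last visit, and going forward is the same as redoing it, which is why one class can serve both purposes.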
By now you might be wondering why we took the extra trouble of creating a new class (UrlItem) when we could simply have declared our UndoRedoHandler as New UndoRedoClass(Of Uri) or New UndoRedoClass(Of String). We could have done that, and it would have worked fine too. But there is a reason for it, which will become clear as we progress. For now, just run the code and check that the Back/Forward and Refresh buttons work as desired.
Adding More Features with the Help of Our UndoRedoClass
We have now achieved our task of creating a TextOnly WebBrowser. It navigates to the URL we type in the address box, and it goes back and forward when we click the Back/Forward buttons. So far everything looks good! Now let’s think about optimizing our WebBrowser.
Notice that when we move Back/Forward, a fresh web page request is sent each time, even though we browsed that page only a few minutes ago. We should not need to fetch the page from the internet again each time we move Back. So let’s see what it would take to implement document caching in our application.
1. Add a property to the UrlItem class named CachedDocument. In this property we will store our cached document.
So after the change our class will look like this:
Class UrlItem
Private _Uri As Uri
Public Property Uri() As Uri
Get
Return _Uri
End Get
Set(ByVal value As Uri)
_Uri = value
End Set
End Property
Private _CachedDocument As String
Public Property CachedDocument() As String
Get
Return _CachedDocument
End Get
Set(ByVal value As String)
_CachedDocument = value
End Set
End Property
Public Sub New(url As String)
Me.Uri = New Uri(url)
End Sub
End Class
2. We modify our MyWebClient_DownloadStringCompleted event handler to fill this cache whenever the document is available. We avoid saving the document when navigation is cancelled, since we might have an incomplete document.
Private Sub MyWebClient_DownloadStringCompleted(sender As Object, e As DownloadStringCompletedEventArgs) Handles MyWebClient.DownloadStringCompleted
If e.Error IsNot Nothing Then
WebDocumentTextBox.Text = e.Error.Message
ElseIf Not e.Cancelled Then
'' e.Result throws if the download was cancelled, so only read it on success
WebDocumentTextBox.Text = HtmlToPlainTextConverter.Convert(e.Result)
UndoRedoHandler.CurrentItem.CachedDocument = WebDocumentTextBox.Text
End If
WebDocumentTextBox.Select()
EnableOrDisableButtons()
StatusLabel.Text = "Ready"
End Sub
3. Next, we change our Navigate method to use this cached document. We add an optional parameter to it so that we can force loading a fresh document when we want to.
Private Sub Navigate(uri As Uri, Optional allowCache As Boolean = False)
If MyWebClient.IsBusy Then MyWebClient.CancelAsync()
While MyWebClient.IsBusy
Threading.Thread.Sleep(1000)
End While
WebDocumentTextBox.Text = ""
UrlTextBox.Text = uri.AbsoluteUri
If allowCache AndAlso UndoRedoHandler.CurrentItem.CachedDocument <> "" Then
WebDocumentTextBox.Text = UndoRedoHandler.CurrentItem.CachedDocument
StatusLabel.Text = "Ready"
Else
MyWebClient.DownloadStringAsync(uri)
StatusLabel.Text = "Looking up " & uri.AbsoluteUri
End If
EnableOrDisableButtons()
End Sub
4. Finally we start using this new Navigate method to support caching wherever possible, i.e. on the Back and Forward buttons. All we need to do is pass True for the optional parameter. If we pass False, or don’t pass anything, it will continue to behave the way it did until now.
Private Sub UndoRedoHandler_RedoHappened(sender As Object, e As UndoRedoEventArgs) Handles UndoRedoHandler.RedoHappened
Dim item As UrlItem = CType(e.CurrentItem, UrlItem)
Navigate(item.Uri, True)
End Sub
Private Sub UndoRedoHandler_UndoHappened(sender As Object, e As UndoRedoEventArgs) Handles UndoRedoHandler.UndoHappened
Dim item As UrlItem = CType(e.CurrentItem, UrlItem)
Navigate(item.Uri, True)
End Sub
Actually we don’t need two separate procedures, since both have the same body and signature. We could combine them into one procedure and attach both events to it. Do that if you want to.
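Combining the two handlers relies on the fact that a VB.NET Handles clause may list more than one event. Here is a minimal self-contained sketch of that; HistoryEvents and Listener are hypothetical classes standing in for UndoRedoClass and the form, introduced only so the example can run on its own.

```vbnet
Imports System

' Hypothetical event source standing in for UndoRedoClass, for illustration only.
Public Class HistoryEvents
    Public Event UndoHappened(sender As Object, e As EventArgs)
    Public Event RedoHappened(sender As Object, e As EventArgs)

    Public Sub Undo()
        RaiseEvent UndoHappened(Me, EventArgs.Empty)
    End Sub

    Public Sub Redo()
        RaiseEvent RedoHappened(Me, EventArgs.Empty)
    End Sub
End Class

Public Class Listener
    Private WithEvents source As New HistoryEvents
    Public HandledCount As Integer

    ' One procedure, with both events listed in the Handles clause.
    Private Sub Source_HistoryMoved(sender As Object, e As EventArgs) _
        Handles source.UndoHappened, source.RedoHappened
        HandledCount += 1
    End Sub

    Public Sub Demo()
        source.Undo()
        source.Redo()
    End Sub
End Class

Module HandlerDemo
    Sub Main()
        Dim demoListener As New Listener
        demoListener.Demo()
        Console.WriteLine(demoListener.HandledCount)   ' 2
    End Sub
End Module
```

In the form, the equivalent would be a single procedure ending in Handles UndoRedoHandler.UndoHappened, UndoRedoHandler.RedoHappened, with the same body as the two procedures shown earlier.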
5. Finally our application is ready with cache support. Run the application and see how it goes.
Click the Back/Forward/Refresh buttons to see how they behave. The Back/Forward buttons should load the document immediately, while the Refresh button gets a fresh copy of the document from the website. So far everything looks excellent! But there is a big pitfall to this approach. Documents can be large, and we have no control over their size. If we keep many of these documents in memory, we will consume a lot of memory unnecessarily and start seeing performance problems when the application is used for long hours.
So what do we do now?
We will use the same technique used by other web browsers: save each document to a disk file and load it whenever we need it. This way the documents will not stay in memory forever. So we now modify the CachedDocument property to store the file name instead of the file contents, and we write the contents to disk.
Private _CachedDocument As String
Public Property CachedDocument() As String
Get
If IO.File.Exists(_CachedDocument) Then
Return IO.File.ReadAllText(_CachedDocument)
Else
Return ""
End If
End Get
Set(ByVal value As String)
'' HH is the 24-hour clock; with hh, names generated 12 hours apart could collide
Dim fileName As String = String.Format("~{0:yyMMddHHmmssffff}.tmp", Now)
fileName = IO.Path.Combine(Application.UserAppDataPath, "Cache", fileName)
IO.File.WriteAllText(fileName, value)
_CachedDocument = fileName
End Set
End Property
That’s all we need to do. The application now uses disk files instead of memory for storing documents.
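A timestamp-based file name is simple, but two documents cached within the same fraction of a second, or by two instances of the application, could in principle end up with the same name. As one possible alternative, the sketch below builds the name from a GUID instead. The CacheFileSketch module, the SaveToCache helper and the demo directory name are hypothetical and only illustrate the idea; the "~" prefix and ".tmp" extension are kept so that a "~*.tmp" search pattern would still match these files.

```vbnet
Imports System
Imports System.IO

Module CacheFileSketch
    ' Hypothetical helper: writes the text to a uniquely named file
    ' inside cacheDir and returns the full path of that file.
    Function SaveToCache(cacheDir As String, content As String) As String
        Directory.CreateDirectory(cacheDir)   ' does nothing if it already exists
        Dim fileName As String = "~" & Guid.NewGuid().ToString("N") & ".tmp"
        Dim fullPath As String = Path.Combine(cacheDir, fileName)
        File.WriteAllText(fullPath, content)
        Return fullPath
    End Function

    Sub Main()
        ' Demo directory under the system temp folder (an assumption for this sketch).
        Dim cacheDir As String = Path.Combine(Path.GetTempPath(), "TextOnlyBrowserCacheDemo")
        Dim first As String = SaveToCache(cacheDir, "page one")
        Dim second As String = SaveToCache(cacheDir, "page two")
        Console.WriteLine(first <> second)
        Console.WriteLine(File.ReadAllText(first))
        ' Remove the demo files again.
        File.Delete(first)
        File.Delete(second)
    End Sub
End Module
```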
This is a very simple caching technique in which we never reuse old files, so old cached files are of no use to us. We therefore clean our cache directory on application startup to avoid accumulating unnecessary files.
Private Sub TextOnlyWebBrowserForm_Load(sender As System.Object, e As System.EventArgs) Handles Me.Load
EnableOrDisableButtons()
StatusLabel.Text = "Ready"
'' Clean old cached files
Dim cacheDir As String = IO.Path.Combine(Application.UserAppDataPath, "Cache")
If Not IO.Directory.Exists(cacheDir) Then IO.Directory.CreateDirectory(cacheDir)
Dim cutOffDate As Date = Now.AddDays(-1)
Try
For Each file In IO.Directory.GetFiles(cacheDir, "~*.tmp")
If IO.File.GetLastAccessTime(file) < cutOffDate Then IO.File.Delete(file)
Next
Catch ex As Exception
'' Best effort only: a file we cannot delete now is simply left for a later run
End Try
End Sub
If you look at the code above, you will notice that I’m not deleting all files from the cache. This is because other instances of our application might be using some of those files. A gap of one day seems like a safe period. Even if we delete a file that is in use by another instance, the worst that can happen is that the instance has to fetch a fresh page from the website when it doesn’t find the file in the cache. So it is harmless anyway.
We now have a WebBrowser that has Back/Forward/Refresh functions and supports cache. And that brings us to the end of this tutorial. I hope you enjoyed reading it as much as I enjoyed bringing it to you.
Enjoy!
Source Code and Executable File
Links to the source code and executable file are given below.
- Source Code: TextOnlyWebBrowser.zip
- Executable File: TextOnlyWebBrowser.exe
