<?xml version="1.0" encoding="iso-8859-1"?>
<feed version="0.3" xmlns="http://purl.org/atom/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
  <title>Complete Software Engineering</title>
  <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/" />
  <modified>2007-12-03T16:35:01Z</modified>
  <tagline>by Rob Grzywinski</tagline>
  <id>tag:www.realityinteractive.com,2008:/rgrzywinski/1</id>
  <generator url="http://www.movabletype.org/" version="3.121">Movable Type</generator>
  <copyright>Copyright (c) 2007, rgrzywinski</copyright>
  <entry>
    <title>String Edit Distance</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000249.html" />
    <modified>2007-12-03T16:35:01Z</modified>
    <issued>2007-12-03T08:39:43-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.249</id>
    <created>2007-12-03T14:39:43Z</created>
    <summary type="text/plain">One commonly used approach for measuring similarity between strings is the string edit distance. The basic premise of the string edit distance is to find the minimum number of character edit operations (copy, replace, add, and remove) needed to transform...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>One commonly used approach for measuring similarity between strings is the <a href="http://en.wikipedia.org/wiki/Edit_distance">string edit distance</a>.  The basic premise of the string edit distance is to find the minimum number of character edit operations (copy, replace, add, and remove) needed to transform one string into the other.  For example, the edit distance between <i>parks</i> and <i>spark</i> is two:  one edit is needed to add a leading <i>s</i> to <i>parks</i> and a second edit is needed to delete the trailing <i>s</i>.  Not surprisingly, string edit distance is used in spell-checking applications.</p>

<p>It is convenient to look at string edit distance as a string alignment problem.  The example below should provide some insight into this:</p>

<pre>
&nbsp;String 1:  &nbsp;&nbsp;&nbsp;p&nbsp;a&nbsp;r&nbsp;k&nbsp;s&nbsp;
&nbsp;String 2:  &nbsp;s&nbsp;p&nbsp;a&nbsp;r&nbsp;k&nbsp;&nbsp;&nbsp;
&nbsp;Operation: &nbsp;i&nbsp;c&nbsp;c&nbsp;c&nbsp;c&nbsp;d&nbsp;
</pre>

<p>The <i>operation</i> is the character operation needed to align or transform string 1 into string 2.  <i>i</i> is insert, <i>c</i> is copy, <i>r</i> is replace, and <i>d</i> is delete.  (You may find other references using <i>s</i> (substitute) instead of <i>r</i> (replace).)  The space at the beginning of string 1 or at the end of string 2 is called a gap.</p>
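<p>As a rough sketch, the operation row in the example above can be reproduced with Python's standard <i>difflib</i> module.  The function name <i>operations</i> and the letter mapping are my own choices for illustration, and <i>difflib</i> does not promise a minimum-cost alignment, so treat this as a way to see the operations rather than as an edit distance algorithm.</p>

```python
from difflib import SequenceMatcher

# Map difflib's opcode tags onto the single-letter operations used above.
OPERATION_CODES = {"equal": "c", "replace": "r", "insert": "i", "delete": "d"}


def operations(string1, string2):
    """Return one operation letter per alignment column (hypothetical helper)."""
    matcher = SequenceMatcher(None, string1, string2, autojunk=False)
    codes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A block spans max(i2 - i1, j2 - j1) columns of the alignment.
        # Replace blocks of unequal length are labelled "r" throughout,
        # which is a simplification.
        codes.append(OPERATION_CODES[tag] * max(i2 - i1, j2 - j1))
    return "".join(codes)


print(operations("parks", "spark"))
```

<p>For <i>parks</i> and <i>spark</i> this prints one insert, four copies and one delete, matching the operation row in the alignment above.</p>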

<p>For a given set of strings, there may be many ways to align the strings.  For example, the strings <i>abc123</i> and <i>abcqwertabc123</i> have two obvious alignments:</p>

<pre>
&nbsp;a&nbsp;b&nbsp;c&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1&nbsp;2&nbsp;3
&nbsp;a&nbsp;b&nbsp;c&nbsp;q&nbsp;w&nbsp;e&nbsp;r&nbsp;t&nbsp;a&nbsp;b&nbsp;c&nbsp;1&nbsp;2&nbsp;3
</pre>

<p>and</p>

<pre>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;b&nbsp;c&nbsp;1&nbsp;2&nbsp;3
&nbsp;a&nbsp;b&nbsp;c&nbsp;q&nbsp;w&nbsp;e&nbsp;r&nbsp;t&nbsp;a&nbsp;b&nbsp;c&nbsp;1&nbsp;2&nbsp;3
</pre>

<p>The particular application would determine which alignment is preferred.</p>

<p>Different distance metrics are used to select the desired alignment.  For example, the measurement used in the original example of the cost between <i>parks</i> and <i>spark</i> is the <a href="http://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a>.  In this metric the cost of replacing, adding or deleting a character is one whereas the cost of equal (or copied) characters is zero.  (In the conversion of <i>parks</i> to <i>spark</i>, the cost of copying letters <i>p a r k</i> is zero.)  The second alignment above (where <i>abc123</i> has no gap) can be obtained using the <a href="http://en.wikipedia.org/wiki/Smith-Waterman_algorithm">Smith-Waterman distance</a>.</p>
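<p>To make the Levenshtein costs concrete, here is a minimal sketch of the usual dynamic programming formulation (copy costs zero; replace, add and delete each cost one).  The function and variable names are mine, and it is meant as an illustration rather than a tuned implementation.</p>

```python
def levenshtein(source, target):
    """Minimum number of replace/add/delete edits turning source into target."""
    # previous[j] is the cost of turning the source prefix processed so far
    # into the first j characters of target. Initially the prefix is empty,
    # so the cost is j additions.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        # Turning i source characters into an empty target costs i deletions.
        current = [i]
        for j, target_char in enumerate(target, start=1):
            copy_or_replace = previous[j - 1] + (0 if source_char == target_char else 1)
            delete = previous[j] + 1
            add = current[j - 1] + 1
            current.append(min(copy_or_replace, delete, add))
        previous = current
    return previous[-1]


print(levenshtein("parks", "spark"))
print(levenshtein("abc123", "abcqwertabc123"))
```

<p>The first call gives the distance of two from the original example.  The second gives eight, one for each added character, regardless of which of the two alignments above is chosen, which is why this metric alone cannot express a preference between them.</p>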

<p>For simple strings it is relatively easy to understand how to transform one string into another but with longer strings or very dissimilar strings it is not so clear.  For example:</p>

<pre>
&nbsp;See Spot run
&nbsp;We work and play
</pre>

<p>There are brute force techniques that consider all possible alignments and find the one(s) with the minimum cost.  I refer you to a few <a href="http://www.cs.umass.edu/~mccallum/papers/crfstredit-uai05.pdf">good</a> <a href="http://www.itu.dk/courses/AVA/E2005/StringEditDistance.pdf">references</a> on string edit distance and the associated dynamic programming algorithms for more information.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>URL Similarity</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000248.html" />
    <modified>2007-12-01T14:26:52Z</modified>
    <issued>2007-11-30T07:13:12-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.248</id>
    <created>2007-11-30T13:13:12Z</created>
    <summary type="text/plain">In order to make template extraction as palatable as possible I am going to start by walking through how one discovers the templates used to generate a set of URLs. I&apos;m going to use Amazon.com as an example. Search for...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>In order to make template extraction as palatable as possible I am going to start by walking through how one discovers the templates used to generate a set of URLs.  I'm going to use <a href="http://www.amazon.com">Amazon.com</a> as an example.</p>

<p>Search for <a href="http://amazon.com/s/ref=nb_ss_gw?url=search-alias%3Daps&field-keywords=book">"book"</a> on Amazon.com.  You should see a page with a number of results / data records.  Placing your mouse over each of the titles shows many similar urls, some of which I have pasted below:</p>

<center>
<table cellpadding="0" cellspacing="0" border="0" style="text-align: left;">
<tr><td>http://www.amazon.com/Harry-Potter-Half-Blood-Prince-Book/dp/0439785960/ref=pd_bbs_1?ie=UTF8&s=books&qid=1196364228&sr=8-1</td></tr>
<tr><td>http://www.amazon.com/Little-Green-Book-Getting-Your/dp/0131576070/ref=pd_bbs_2?ie=UTF8&s=books&qid=1196364228&sr=8-2</td></tr>
<tr><td>http://www.amazon.com/Learning-Curve-LC97727-Lamaze-Caterpillar/dp/B00009IMD8/ref=pd_bbs_3?ie=UTF8&s=baby-products&qid=1196364228&sr=8-3</td></tr>
<tr><td>http://www.amazon.com/Book-General-Ignorance-John-Mitchinson/dp/0307394913/ref=pd_bbs_sr_4?ie=UTF8&s=books&qid=1196364228&sr=8-4</td></tr>
<tr><td>http://www.amazon.com/The-Daring-Book-for-Girls/dp/B000UZJQNM/ref=sr_1_13?ie=UTF8&s=books&qid=1196364228&sr=8-13</td></tr>
<tr><td>http://www.amazon.com/Inconvenient-Book-Solutions-Biggest-Problems/dp/B000WJVLLG/ref=sr_1_16?ie=UTF8&s=books&qid=1196364228&sr=8-16</td></tr>
</table>
</center>

<p>We can easily see a pattern (the template) in those urls:</p>

<pre class="code">
http://www.amazon.com/&lt;title&gt;/dp/&lt;book id&gt;/ref=&lt;ref id&gt;?ie=UTF8&s=&lt;product type&gt;&qid=1196364228&sr=8-&lt;index&gt;
</pre>

<p>Our job is to find an algorithm that will allow us to automatically discover that template.</p>
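<p>To give a feel for what "discovering the template" means, here is a deliberately naive sketch.  It is not the algorithm developed in this series:  it assumes every URL has the same number of path segments and the same query parameters in the same order, and the names <i>url_tokens</i> and <i>naive_template</i> are hypothetical.  It splits each URL into its parts and marks every position whose value varies with a wildcard.</p>

```python
from urllib.parse import urlsplit, parse_qsl


def url_tokens(url):
    """Break a URL into (position name, value) pairs (hypothetical helper)."""
    parts = urlsplit(url)
    tokens = [("scheme", parts.scheme), ("authority", parts.netloc)]
    for index, segment in enumerate(parts.path.strip("/").split("/")):
        tokens.append(("path[%d]" % index, segment))
    for key, value in parse_qsl(parts.query):
        tokens.append(("query:" + key, value))
    return tokens


def naive_template(urls):
    """Keep values shared by every URL; replace varying ones with '*'."""
    rows = [url_tokens(url) for url in urls]
    template = []
    # zip stops at the shortest row, so URLs of different shapes are
    # silently truncated. That is one reason this is only a sketch.
    for column in zip(*rows):
        name = column[0][0]
        values = {value for _, value in column}
        template.append((name, values.pop() if len(values) == 1 else "*"))
    return template
```

<p>Run against the URLs above, the title, the id, the ref segment, the product type and the index come out as wildcards while <i>dp</i>, <i>ie=UTF8</i> and the <i>qid</i> value stay fixed.  Note that this treats the whole <i>ref=...</i> segment as variable, so it is coarser than the hand-written template.</p>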

<p>(I should mention that if one continues to other pages in the search results, there are more differences in the URLs, which lead to a different pattern.  In order to keep the discussion simple, I will only use the first page's URLs.  This demonstrates that a single page, even one that contains multiple records, might not contain enough information to deduce a complete template.</p>

<p>I also should mention that the ability to label the fields in the template as I did in the example above will not be covered (in the near future) in this series of postings.  I will only state that there are techniques available in the literature on web wrappers for labeling fields.)</p>

<p>We are going to take two approaches to extract templates from URLs.  The first will use traditional <a href="http://en.wikipedia.org/wiki/Edit_distance">string edit distance</a> algorithms and the second will exploit the fact that there is underlying structure in the url (i.e. the scheme, authority, path, query, and fragment).</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Template Extraction</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000247.html" />
    <modified>2007-11-29T20:18:26Z</modified>
    <issued>2007-11-29T12:47:06-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.247</id>
    <created>2007-11-29T18:47:06Z</created>
    <summary type="text/plain">The goal of template extraction is to discover the template(s) (if there are any) used to generate a page. There are two major categories of templates: those that are used within a single web page (such as the individual search...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>The goal of template extraction is to discover the template(s) (if there are any) used to generate a page.  There are two major categories of templates:  those that are used <i>within</i> a single web page (such as the individual search results entries on a Google search result page) and those that are used <i>across</i> web pages (such as a story template used on CNN.com).  In the former case, the goal is to be able to extract the template(s) from any page that has at least two similar data records on it.  In the latter case, the goal is to be able to extract the template(s) from any two pages that have similar data records.  As more similar records or pages are found we expect our precision to increase.</p>

<p>By visually looking at a rendered Google search results page, for example, a human can easily see similar, repeated sections.  Although it is a bit more difficult, someone familiar with HTML can look at the source to that same Google page and find the repeated sections.  Similarly, there are two general approaches to automatic template discovery -- one that relies on the rendered representation of the page and one that relies on the source of the page.  A survey of the research literature does not show a distinct advantage of one technique over the other and in many cases the underlying algorithms are very similar.  I am going to focus on the latter approach since that is where my experience lies.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Template Extraction and Web Wrapper Generation</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000246.html" />
    <modified>2007-11-29T17:01:25Z</modified>
    <issued>2007-11-29T10:34:26-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.246</id>
    <created>2007-11-29T16:34:26Z</created>
    <summary type="text/plain">With the recent movement towards mashups, the semantic web and market intelligence, there is a large need to get at the data and information that is stored in web pages. Data extraction startups are popping up like weeds (e.g. InfoSquire...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>With the recent movement towards mashups, the semantic web and market intelligence, there is a large need to get at the data and information that is stored in web pages.  Data extraction startups are popping up like weeds (e.g. <a href="http://www.infosquire.com/">InfoSquire</a> and <a href="http://www.ql2.com/">QL2</a>).  Many of these startups focus on services where you specify what sites you want scraped and they provide you with the resulting data feed.  The technologies that they use are primarily rules-based (e.g. regex).  Rules-based systems are highly brittle given the dynamic nature of the web.  Specifically, the rules are costly to maintain and monitor because they must be kept up to date with any changes made to the underlying web pages.  The ability to automatically generate data extractors with high precision would be a vast improvement over a rules-based system.</p>

<p>Much research has gone into extracting data (either structured or unstructured) from a web page using web wrappers.  A web wrapper is a tool for "converting information implicitly stored as an HTML document into information explicitly stored as a data-structure for further processing" [<a href="http://db.cis.upenn.edu/DL/WWW8/index.html">W4F</a>].  One particular type of automatically generated web wrapper uses template extraction.  Template extraction is the inverse of creating a web page from a template -- for a given web page, attempt to deduce the template that was used to generate the page.  If a template can be generated for any (template derived) web page, then the data that populates that template can be easily extracted.</p>
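<p>A toy sketch of the last point, assuming the template is already known:  rendering fills a template with a record, and extraction runs the same template backwards to recover the record.  The <i>{field}</i> placeholder syntax and the names <i>render</i> and <i>extract</i> are assumptions made for this illustration;  real pages and real wrappers are far messier.</p>

```python
import re


def render(template, record):
    """Generate a page fragment by filling the template with a record."""
    return template.format(**record)


def extract(template, page):
    """Recover the record from a page produced by the given template."""
    # Splitting on the placeholders yields alternating literal text and
    # field names: [literal, name, literal, name, ..., literal].
    pieces = re.split(r"\{(\w+)\}", template)
    literals, names = pieces[0::2], pieces[1::2]
    pattern = re.escape(literals[0])
    for literal in literals[1:]:
        # Each field becomes a lazy capture group followed by its literal.
        pattern += "(.+?)" + re.escape(literal)
    match = re.fullmatch(pattern, page)
    if match is None:
        return None  # the page was not generated by this template
    return dict(zip(names, match.groups()))


template = "Title: {title} | Price: {price}"
page = render(template, {"title": "Code Complete", "price": "35.00"})
print(extract(template, page))
```

<p>The hard part, and the subject of template extraction, is deducing the template itself when only the generated pages are available.</p>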

<p>Over the next few weeks I am going to focus on machine learning and other automatic web wrapper technologies in a series of postings.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Excuse the mess</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000245.html" />
    <modified>2007-11-28T17:10:12Z</modified>
    <issued>2007-11-28T11:09:28-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.245</id>
    <created>2007-11-28T17:09:28Z</created>
    <summary type="text/plain">I&apos;m in the process of updating the blog template. Things will be a little goofy for a day or so....</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>General</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      I&apos;m in the process of updating the blog template.  Things will be a little goofy for a day or so.
      
    </content>
  </entry>
  <entry>
    <title>Amen</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000244.html" />
    <modified>2007-11-17T15:05:13Z</modified>
    <issued>2007-11-17T08:59:00-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.244</id>
    <created>2007-11-17T14:59:00Z</created>
    <summary type="text/plain">&quot;Testing by itself does not improve software quality. Test results are an indicator of quality, but in and of themselves, they don&apos;t improve it. Trying to improve Software quality by increasing the amount of testing is like trying to lose...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<div>"Testing by itself does not improve software quality. Test results are an indicator of quality, but in and of themselves, they don't improve it. Trying to improve Software quality by increasing the amount of testing is like trying to lose weight by weighing yourself more often. What you eat before you step onto the scale determines how much you will weigh, and the software development techniques you use determine how many errors testing will find. If you want to lose weight, don't buy a new scale; change your diet. If you want to improve your software, don't test more; develop better."</div>
<div style="text-align: right;"><i>Steve McConnell</i>, Code Complete</div>]]>
      
    </content>
  </entry>
  <entry>
    <title>In vogue languages</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000243.html" />
    <modified>2007-05-31T03:49:07Z</modified>
    <issued>2007-05-30T21:34:00-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.243</id>
    <created>2007-05-31T03:34:00Z</created>
    <summary type="text/plain">I&apos;m often asked why I don&apos;t hop on the latest language bandwagon and just start coding up a storm. The answer comes in two parts: the first is that I do try out these languages to see what the hype...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Programming</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>I'm often asked why I don't hop on the latest language bandwagon and just start coding up a storm.  The answer comes in two parts:  the first is that I do try out these languages to see what the hype is all about, to see where they can fit in and to see their pros and cons.  The second is that I realize that there is more to software engineering than just writing code.  Software spends disproportionately more time in maintenance than it does in initial development.  Just because a language such as Ruby is much faster for initial development doesn't mean that it's much easier to maintain.  (Do note that I'm not saying that Ruby is hard / harder to maintain.  I'm simply saying that one cannot determine what the maintenance model for a language is from doing only initial development.)  The long and short of all of this is that I am forced by my professionalism and my responsibilities to look not only at how a language works for initial development but also at how it works for long term maintenance.  By definition this means that it takes me a very long time to determine if a language is suitable in the long term.  Since many newly in vogue languages simply haven't been out long enough for either the community or me to understand their maintenance models, one simply cannot start writing production code with them.</p>
<p>One quick example of all of this is <a href="http://en.wikipedia.org/wiki/Aspect-oriented_programming">AOP</a>.  I'm enamored with AOP but I cannot and will not use it in production software.  The reason is that AOP simply does not have a maintenance model <i>at all</i>.  In other words, I cannot take an AOP'ified application and further apply AOP to it (i.e. maintain it) and have understandable and determinable effects.</p>

<p><i>Editor's note:  this is a stream of consciousness posting to get an idea down and is not complete or thorough in any way.  But as always, comments are welcome.</i></p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Sold!</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000240.html" />
    <modified>2007-12-03T21:41:28Z</modified>
    <issued>2007-02-08T08:46:07-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.240</id>
    <created>2007-02-08T14:46:07Z</created>
    <summary type="text/plain">I just stumbled on Live Clipboard and I&apos;m sold. The screencasts are quite enlightening (though for type-A people like myself very painful to sit through). For interested parties, the discussion archives are located here rather than the broken link from...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>I just stumbled on <a href="http://www.liveclipboard.org/">Live Clipboard</a> and I'm sold.  The <a href="http://spaces.live.com/editorial/rayozzie/demo/liveclip/screencast/liveclipdemo.html">screencasts</a> are quite enlightening (though for type-A people like myself very painful to sit through).</p>

<p>For interested parties, the discussion archives are located <a href="http://discussms.hosting.lsoft.com/archives/live-clip.html">here</a> rather than the broken link from the Live Clipboard site.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Yaargh!</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000239.html" />
    <modified>2007-01-23T15:25:31Z</modified>
    <issued>2007-01-23T09:09:46-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.239</id>
    <created>2007-01-23T15:09:46Z</created>
    <summary type="text/plain">I don&apos;t know about the rest of you but my parents baffle me. When working with them on issues that crop up from time to time I try to view them as two average people so that I can be...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>General</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>I don't know about the rest of you but my parents baffle me.  When working with them on issues that crop up from time to time I try to view them as two average people so that I can be objective about the situation and not allow decisions to be marred by emotion.  The recent situation that has arisen (and caused me to write this entry) revolves around my parents' view of planning.  The way my parents approach a difficult and potentially costly planning problem is to simply worry.  "Huh?!?" you may be thinking.  You read that right.  Their solution to a problem involving planning (and in fact most problems) is to worry.  What's funny (in an ironic sense) is that their "solution" tends to lead to the worst, most costly and most stressful conclusions, which leads them to worry more.</p>

<p>I could elaborate further on this topic but I'm still in the head shaking phase (i.e. "denial").  Yargh!</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Proverb</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000238.html" />
    <modified>2007-01-15T14:37:24Z</modified>
    <issued>2007-01-15T08:35:05-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.238</id>
    <created>2007-01-15T14:35:05Z</created>
    <summary type="text/plain">&quot;The person who says it cannot be done should not interrupt the person doing it.&quot; Chinese Proverb, 2006...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>General</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<div>"The person who says it cannot be done should not interrupt the person doing it."</div>
<div style="text-align: right;"><i>Chinese Proverb</i>, 2006</div>]]>
      
    </content>
  </entry>
  <entry>
    <title>Incompetent People Really Have No Clue</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000237.html" />
    <modified>2007-01-13T16:31:28Z</modified>
    <issued>2007-01-13T09:36:10-06:00</issued>
    <id>tag:www.realityinteractive.com,2007:/rgrzywinski/1.237</id>
    <created>2007-01-13T15:36:10Z</created>
    <summary type="text/plain">Incompetent People Really Have No Clue is an old article but is just as relevant today as it was 7 years ago. My wife is a public high school science teacher and she will periodically ask the students to submit...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Management</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p><a href="http://www.sfgate.com/cgi-bin/article.cgi?file=/chronicle/archive/2000/01/18/MN73840.DTL">Incompetent People Really Have No Clue</a> is an old article but is just as relevant today as it was 7 years ago.</p>

<p>My wife is a public high school science teacher and she will periodically ask the students to submit what they think that their grade is.  Her findings match those of the article:  poorer performing students tend to believe they are earning a grade significantly higher than is the case.  This results in the odd implication that one cannot expect these students to contribute more effort to achieving a higher grade since they already believe that they've done what is necessary to achieve that higher grade.  Without constant vigilance, one-on-one instruction, and continual family support to effectively force the student to exert the necessary effort to achieve the higher grade, the student will never know where the various rungs of the ladder lie and therefore will never know what subject mastery means.</p>

<p>I too have witnessed the same phenomenon working with developers over the years.  Developers that produce code with the most logical errors that requires the majority of effort to maintain tend to be those that believe highly in their abilities.  I have taken a few of these developers by the hand and walked them through a few full life-cycles of development demonstrating what is necessary to produce quality code (and to see the implications of poor quality code).  Most of my efforts were greeted with incredulous responses and outright denial of the effort involved but there has been an individual or two that has gained more understanding and used that experience to take their craft to a new level.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Dapper</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000236.html" />
    <modified>2007-12-03T21:41:42Z</modified>
    <issued>2006-12-18T09:21:33-06:00</issued>
    <id>tag:www.realityinteractive.com,2006:/rgrzywinski/1.236</id>
    <created>2006-12-18T15:21:33Z</created>
    <summary type="text/plain">I have been doing some research on the Enterprise 2.0 landscape and web mashup tools. I stumbled across Dapper and their demo videos. It would be an understatement to say that I was impressed with what I saw. I have...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>I have been doing some research on the <a href="http://sloanreview.mit.edu/smr/issue/2006/spring/06/">Enterprise 2.0</a> landscape and <a href="http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)">web mashup</a> <a href="http://blogs.zdnet.com/Hinchcliffe/?p=63">tools</a>.  I stumbled across <a href="http://www.dappit.com/index.php">Dapper</a> and their <a href="http://www.dappit.com/dapperDemo/">demo</a> <a href="http://www.dappit.com/dapperDemo/aggregatorAid.php">videos</a>.  It would be an understatement to say that I was impressed with what I saw.  I have no knowledge about the stability, performance, and actual usability of their product but I am certainly excited about what they're promising.  Their <a href="http://dapper.wordpress.com/">blog</a> has some information about new features and possible uses for those features (such as the <a href="http://dapper.wordpress.com/2006/11/05/login-dapps/">Login Dapps</a>).  This company is obviously ripe for the buying and I hope that someone that has a history of realizing acquired companies brings this product to full maturity.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>StreamCruncher 1.0, a lightweight Event Processing Kernel</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000235.html" />
    <modified>2007-11-29T18:17:14Z</modified>
    <issued>2006-12-15T07:08:52-06:00</issued>
    <id>tag:www.realityinteractive.com,2006:/rgrzywinski/1.235</id>
    <created>2006-12-15T13:08:52Z</created>
    <summary type="text/plain">StreamCruncher is an Event Processor. It supports a language based on SQL which allows you to define Event Processing constructs like Sliding Windows, Time Based Windows, Partitions and Aggregates. Queries can be written using this language, which are used to...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<div class="quote">StreamCruncher is an Event Processor. It supports a language based on SQL which allows you to define Event Processing constructs like Sliding Windows, Time Based Windows, Partitions and Aggregates. Queries can be written using this language, which are used to monitor streams of incoming Events. StreamCruncher is a multi-threaded Kernel that runs on Java&trade;.</div>

<p>Check out <a href="http://www.streamcruncher.com/">StreamCruncher</a> for more info.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>C to MIPS to Java bytecode</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000234.html" />
    <modified>2006-12-15T13:04:18Z</modified>
    <issued>2006-12-15T07:02:20-06:00</issued>
    <id>tag:www.realityinteractive.com,2006:/rgrzywinski/1.234</id>
    <created>2006-12-15T13:02:20Z</created>
    <summary type="text/plain">I refer you to binkley&apos;s BLOG for information regarding translating C/C++ code to Java....</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>Development</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>I refer you to <a href="http://binkley.blogspot.com/2006/12/brilliant-c-to-mips-to-java-bytecode.html">binkley's BLOG</a> for information regarding translating C/C++ code to Java.</p>]]>
      
    </content>
  </entry>
  <entry>
    <title>Rollback FireFox</title>
    <link rel="alternate" type="text/html" href="http://www.realityinteractive.com/rgrzywinski/archives/000232.html" />
    <modified>2006-11-16T13:12:04Z</modified>
    <issued>2006-11-16T07:10:43-06:00</issued>
    <id>tag:www.realityinteractive.com,2006:/rgrzywinski/1.232</id>
    <created>2006-11-16T13:10:43Z</created>
    <summary type="text/plain">After two hard Windows crashes and horrible performance enough was enough with Firefox 2.0. Rolling back to 1.5 was simple and neither my bookmarks nor my profile were lost. This link provided me with enough information to get the job...</summary>
    <author>
      <name>rgrzywinski</name>
      <url>http://www.realityinteractive.com/rgrzywinski</url>
      <email>rgrzywinski@yahoo.com</email>
    </author>
    <dc:subject>General</dc:subject>
    <content type="text/html" mode="escaped" xml:lang="en" xml:base="http://www.realityinteractive.com/rgrzywinski/">
      <![CDATA[<p>After two hard Windows crashes and horrible performance enough was enough with Firefox 2.0.  Rolling back to 1.5 was simple and neither my bookmarks nor my profile were lost.  This <a href="http://www.dslreports.com/forum/remark,17216648">link</a> provided me with enough information to get the job done.</p>]]>
      
    </content>
  </entry>

</feed>