Category: Featured Posts

Top Ten Tips for Writing Plumtree Crawlers that Actually Work

Post author By Christopher Bucchere
Post date October 26, 2008

Just in time for Halloween, I’ve decided to publish my Top Ten Tips for Writing Plumtree Crawlers that Actually Work. This post may scare you a little bit, but hey, that’s the spirit of Halloween, right?

[Editor’s note: yes, we’re still calling it Plumtree. Why? I did a Google search today and 771,000 hits came up for “plumtree” as opposed to around 300,000 for “aqualogic” and just over 400,000 for “webcenter.” Ignoring the obvious — that a short, simple name always wins over a technically convoluted one — it just helps clarify what we’re talking about. For example, if we say “WebCenter,” no one knows whether we’re talking about Oracle’s drag-n-drop environment for creating JSR-168 portlets (WebCenter Suite) or Plumtree’s Foundation/Portal (WebCenter Interaction). So, frankly, you can call it whatever you want, but we’re still gonna call it Plumtree so that people will know WTF we’re talking about.]

So, you want to write a Plumtree Crawler Web Service (CWS), eh?

Here are ten tips that I learned the hard way (i.e. by NOT doing them):

1. Don’t actually build a crawler
2. If you must, at least RTFM
3. “Hierarchyze” your content
4. Test first
5. When testing, use the Service Station 2.0 (if you can get it)
6. Code for thread safety
7. Write DRY code (or else)
8. Don’t confuse ChildDocument with Document
9. Use the named methods on DocumentMetaData
10. RTFM (again)

Before I get into the gory details, let me give you some background. First off, what’s a CWS anyway? It’s the code behind what Oracle now calls Content Services, which spider through various types of content (for lack of a better term) and import pointers to those bits of content into the Knowledge Directory. This ability to spider content and normalize its metadata is one of the most underrated features in Plumtree. (FYI, it was also the first feature we built and arguably, the best.)

Each bit of spidered content is called a Document or a Card or a Link depending on whether you’re looking at the product, the API or the documentation, respectively. It’s important to realize that CWSs don’t actually move content into Plumtree; rather, they store only pointers/links and metadata and they help the Plumtree search engine (known under the covers as Ripfire Ignite) build its index of searchable fulltext and properties.

Today, Plumtree ships with one OOTB CWS that knows how to crawl/spider web pages. Not surprisingly, it’s known as the Web Crawler. Don’t let the name mislead you: the web crawler can actually crawl almost anything, as I explain in my first tip, which is:

Don’t actually build a crawler.

But I’m getting ahead of myself.

So, back to the background on crawlers. Oracle ships five of ’em, AFAIK: one for Windows files, one for Lotus Notes databases, one for Exchange Public Folders, one for Documentum and one for Sharepoint. Their names give you blatantly obvious hints at what they do, so I won’t get into it. Along with the OOTB crawlers, Oracle also exposes a nice, clean API for writing crawlers in Java or .NET. (If you really want to push the envelope, you can try writing a crawler in PHP, Python, Ruby, Perl, C++ or whatever, but it’s hard enough to write one in Java or .NET, so I wouldn’t go there. If you do, though, make sure that your language has a really good SOAP stack.)

So, after reading this, you still want to write a crawler, yes?

Let’s get into my Top Ten Tips:

1. Don’t actually build a crawler

Yes, you really don’t want to go here. Building crawlers is not that hard, as there’s a clean, well documented API. However, getting them to work is a whole other story.

Most applications these days have a web UI. So, take advantage of it. Point the OOTB web crawler at the web UI and see what it does. Some web UIs will work well; others won’t (particularly if they use frames or lots of JavaScript).

Let’s assume for a moment that this technique doesn’t work. Or perhaps you’re dealing with some awful client-server or greenscreen POS that doesn’t have a web UI. Either way, you may still be able to use the web crawler.

How? Well, the web crawler can crawl almost anything using something that we used to call the Woogle Web. Think of it this way. Say you want to crawl a database. Perhaps that database contains bug reports. Rather than waste your time trying to write a database crawler, just write an index.jsp (or .php, .aspx, .html.erb, .you-name-it) page that does something like select id from bugs and dumps out a list of all the bugs in the database. Then, hyperlink each one to a detail page (that’s essentially a select * from bugs where id = ? query). Your index page can be sinfully ugly. However, put some effort into your detail pages, making them look pretty AND using meta tags to map each field in the database to its value.

Then, simply point the OOTB web crawler at your index page, tell it not to import your index page, map your properties to the appropriate meta tags, crawl at a depth of one level, and get yourself a nice cup of coffee. By the time you get back, the OOTB web crawler will have created documents/links/cards for every bug with links to your pretty detail pages and every bit of text will be full-text searchable. So will every property, meaning that you can run an advanced search on bug id, component, assignee, severity, etc.
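To make that concrete, here is a rough sketch of the two page types in plain Java (the same logic would sit inside an index.jsp and a detail page). The class name, the detail.jsp URL and the field names are all made up for illustration; the only real requirements are a list of links on the index page and one meta tag per field on each detail page.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: renders the two kinds of pages a Woogle Web needs.
class WoogleWebSketch {

    // Escape the characters that matter inside HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Index page: one plain link per bug id (the result of "select id from bugs").
    // It can be ugly; the crawler only follows the links.
    // Assumes simple ids (e.g. numeric) that need no URL encoding.
    static String renderIndex(List<String> bugIds) {
        StringBuilder html = new StringBuilder("<html><body>\n");
        for (String id : bugIds) {
            String safeId = escape(id);
            html.append("<a href=\"detail.jsp?id=").append(safeId)
                .append("\">Bug ").append(safeId).append("</a><br/>\n");
        }
        return html.append("</body></html>\n").toString();
    }

    // Detail page: one meta tag per column so each tag name can be mapped
    // to a portal property, plus a readable body for full-text indexing.
    static String renderDetail(Map<String, String> bug) {
        StringBuilder head = new StringBuilder();
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : bug.entrySet()) {
            String name = escape(field.getKey());
            String value = escape(field.getValue());
            head.append("<meta name=\"").append(name)
                .append("\" content=\"").append(value).append("\"/>\n");
            body.append("<p><b>").append(name).append(":</b> ")
                .append(value).append("</p>\n");
        }
        return "<html><head>\n" + head + "</head><body>\n" + body + "</body></html>\n";
    }
}
```

Feed renderDetail one row from the select * query (as a map of column name to value) and the crawler sees a meta tag for every column.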

At Plumtree, we used to call this a Woogle Web. It may sound ridiculous, but a Woogle Web is a great way to crawl virtually anything without lifting a finger.

However, a Woogle Web won’t work for everything. If you’re dealing with a product where you can’t even get your head around the database schema AND you have a published Java, .NET or Web Service API, then you might want to think about writing a custom crawler.

2. If you must, at least RTFM

If you’re anything like me, reading the manual is what you do after you’ve tried everything else and nothing has worked out. In the case of Plumtree crawlers, I recommend swallowing your pride (at least momentarily) and reading all of the documentation, including their “tips” (which are totally different from and not nearly as entertaining as my tips, but equally valuable).

Once you’re done reading all the documentation, you might also want to consult Tip #10.

3. “Hierarchyze” your content

Um, yeah, I know “hierarchyze” isn’t a word. But since crawlers only know how to crawl hierarchies, if your data aren’t hierarchical, you darn well better figure out how to represent them hierarchically before you start writing code. Even if you don’t think this step is necessary, just do it because I said so for now. You’ll thank me later.
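As one possible illustration, suppose the flat data are bug rows of the form (component, bug id). Grouping them by component gives a two-level hierarchy: components become containers and bugs become the documents inside them. The class and field choices below are assumptions for the sake of the example, not anything the API dictates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: turn flat (component, bugId) rows into folders of bugs.
class HierarchySketch {

    // Each row is {component, bugId}. The result maps a container name
    // (the component) to the ids of the documents it holds, with
    // containers sorted by name so repeated crawls see a stable order.
    static Map<String, List<String>> byComponent(List<String[]> rows) {
        Map<String, List<String>> folders = new TreeMap<>();
        for (String[] row : rows) {
            folders.computeIfAbsent(row[0], key -> new ArrayList<>()).add(row[1]);
        }
        return folders;
    }
}
```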

4. Test first

Don’t even try to write your crawler and then run it and expect it to work. Ha! Instead, write unit tests for every modular bit of code that you throw down. To whatever extent possible, write these tests first. It’ll save your butt later.

5. When testing, use the Service Station 2.0 (if you can get it)

When you do finally get around to integration testing your crawler, it’ll save you a lot of time if you use the Service Station 2.0. However, it may take you a long time to get it, so start the process early.

Unlike every product Oracle distributes, Service Station is 1) free and 2) not available for download. Yes, you read that correctly.

To get it, you need to contact support. I called them and told them I needed it and after two weeks I got nothing but a confused voicemail back saying that support doesn’t fulfill product orders. Um, yeah. So I called back and literally begged to talk to someone who actually knew what this thing was. Then I painstakingly explained why I couldn’t get it from edelivery (because it’s not there) nor from commerce.bea.com (because it’s permanently redirecting to oracle.com) nor from one.bea.com (because there it says to contact support). So, after my desperate pleas and 15 calendar days of waiting, I got an e-mail with an FTP link to download the Service Station 2.0.

After installing this little gem, my life got a lot easier. Now instead of testing by launching a Plumtree job to kick off the crawler (and then watching it crash and burn), I could use the Service Station to synchronously invoke each method on the crawler and log the results.

Another handy testing tool is the PocketSOAP TCPTrace utility. (It’s also very handy for writing Plumtree portlets.) You can set it up between the Service Station and your CWS and watch the SOAP calls go back and forth in clear text. Very nice.

6. Code for thread safety

So, as the documentation says (and as I completely ignored), crawlers are multithreaded. The portal will launch several threads against your code and, unless you code for thread safety, these threads will proceed to eat your lunch.

Coding for thread safety means not only that you need to synchronize access to any class-level variables, but also that you must use only thread-safe objects (e.g. in Java, use Vector or a Collections.synchronizedList wrapper instead of a plain ArrayList).
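Here is a small sketch of the collection side of that advice, using only the standard library. The class and method names are invented for the example. Several threads append to one shared list; because the list is wrapped with Collections.synchronizedList, no additions are lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: many threads writing to one shared list.
class ThreadSafeListSketch {

    // Starts threadCount workers that each add itemsPerThread entries to
    // the shared list, waits for all of them, and returns the final size.
    static int fillConcurrently(List<Integer> shared, int threadCount, int itemsPerThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < itemsPerThread; i++) {
                    shared.add(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return shared.size();
    }
}
```

Called with Collections.synchronizedList(new ArrayList<>()) or a Vector, the returned size is always threadCount times itemsPerThread. Called with a bare ArrayList, additions can be silently lost or a worker can die with an ArrayIndexOutOfBoundsException, which is the lunch-eating behavior described above.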

7. Write DRY code (or else)

Even though you’re probably writing your CWS in Java or .NET, stick to the ol’ Ruby on Rails adage:

Don’t Repeat Yourself.

Say for example, that you need to build a big ol’ Map of all the Document objects in order to retrieve a document and send its metadata back to the mothership (Plumtree). It’s really important that you don’t build that map every time IDocumentProvider.attachToDocument is called. If you do, your crawler is going to run for a very very very long time. Crawlers don’t have to be super fast, but they shouldn’t be dogshit slow either.

As a better choice, build the Map the first time attachToDocument is called and store it as a class-level variable. Then, with each successive call to attachToDocument, check for the existence of the Map and, if it’s already built, don’t build it again. And don’t forget to synchronize not only the building of the Map, but also the access to the variable that checks whether the Map exists or not. Like I said, this isn’t a walk in the park. (See Tip #1.)
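A minimal sketch of that build-once pattern follows. Everything in it is a stand-in: buildIndex fakes the expensive back-end query, and lookup plays the part of whatever your attachToDocument implementation calls. One lock guards both the existence check and the build, so concurrent callers cannot build the Map twice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: build an expensive lookup Map once, share it safely.
class DocumentIndexSketch {

    private static final Object LOCK = new Object();
    private static Map<String, String> documentsById; // guarded by LOCK

    // Exposed only so the example can show the build ran exactly once.
    static final AtomicInteger buildCount = new AtomicInteger();

    // Stand-in for the slow part (e.g. walking the whole back-end system).
    private static Map<String, String> buildIndex() {
        buildCount.incrementAndGet();
        Map<String, String> index = new HashMap<>();
        index.put("bug-1", "Crawler crashes on empty folder");
        index.put("bug-2", "Metadata missing FileName");
        return index;
    }

    // The check and the build happen under the same lock, so a second
    // thread arriving mid-build waits instead of building its own copy.
    static String lookup(String documentId) {
        synchronized (LOCK) {
            if (documentsById == null) {
                documentsById = buildIndex();
            }
            return documentsById.get(documentId);
        }
    }
}
```

Holding the lock for every lookup is the simplest correct option. If it ever shows up as a bottleneck, a ConcurrentHashMap published through a volatile field is the usual next step.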

8. Don’t confuse ChildDocument with Document

IContainer has a getChildDocuments method. This, contrary to how it looks on the surface, does not return IDocument objects. Instead, it expects you to return an array of ChildDocument objects. These, I repeat, are not IDocument objects. Instead, they’re like little containers of information about child documents that Plumtree uses so that it knows when to call IDocumentProvider.attachToDocument. It is that call (attachToDocument) and not getChildDocuments that actually returns an IDocument object, where all the heavy lifting of document retrieval actually gets done.
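The shape of that two-phase handshake can be sketched with stand-in types. To be loud about it: ChildRef and LoadedDoc below are invented for this example and are not the IDK's ChildDocument or IDocument, and the method names only mimic the roles of getChildDocuments and attachToDocument.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types only: these are NOT the IDK's ChildDocument or IDocument.
class TwoPhaseCrawlSketch {

    // Lightweight descriptor: just enough for the portal to ask for
    // the real document later.
    static final class ChildRef {
        final String location;
        ChildRef(String location) { this.location = location; }
    }

    // Heavyweight object: created only when one document is attached.
    static final class LoadedDoc {
        final String location;
        final String body;
        LoadedDoc(String location, String body) {
            this.location = location;
            this.body = body;
        }
    }

    static int expensiveLoads = 0; // counts how often the slow path runs

    // Phase 1 (the getChildDocuments role): list children cheaply.
    static List<ChildRef> listChildren(List<String> locations) {
        List<ChildRef> refs = new ArrayList<>();
        for (String location : locations) {
            refs.add(new ChildRef(location));
        }
        return refs;
    }

    // Phase 2 (the attachToDocument role): do the real retrieval for one child.
    static LoadedDoc attach(ChildRef ref) {
        expensiveLoads++;
        return new LoadedDoc(ref.location, "contents of " + ref.location);
    }
}
```

Listing the children never touches the slow path; only attach does, once per document.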

You may not understand this tip right now, but if you drop by and read it again after you’ve tried to code against the API for a few hours, it should make more sense.

9. Use the named methods on DocumentMetaData

This one really burned me badly. I saw that DocumentMetaData had a “put” method. So, naturally, I called it several times to fill it up with the default metadata. Then I wasted the next two hours trying to figure out why Plumtree kept saying that I was missing obvious default metadata properties (like FileName). The solution? Call the methods that actually have names like setFileName, setIndexingURL, etc. — don’t use put for standard metadata. Instead, only use it for custom metadata.

10. RTFM (again)

I can’t stress the importance of reading the documentation enough.

If you think you understand what to do, read it again anyway. I guarantee that you’ll set yourself up for success if you really read and thoroughly digest the documentation before you lift a finger and start writing your test cases (which of course you’re going to write before you write your code, right?).

* * *

As always, if you get stuck, feel free to reach out to the Plumtree experts at bdg. We’re here to help. But don’t be surprised if the first thing we do is try to dissuade you from writing a crawler.

Have a safe and happy Halloween and don’t let Plumtree Crawlers scare you more than they should.

Boo!

Tags bea, crawlers, cws, enterprise software, how to, oracle, portlets

Nobody’s Gonna Read This (and Why That Makes Me Happy)

Post author By Christopher Bucchere
Post date August 15, 2008

Boy do I love the fact that no one reads this blog. And to the few people who are exceptions to that general rule — thank you for being so supportive!

I just hit two or three web pages in a row (TechCrunch, Digg and the Meebo blog) wherein each post I read had 80+ comments that reminded me why I rarely ever actually read comments.

Haters, trolls, flamers, spammers — whatever you want to call them, the internet is ridden with people who are filled with spite and rage. The funny thing is that in no other forum (except for perhaps while driving) are people this cruel to one another. It’s just not socially acceptable.

I realize that e-hate isn’t a new problem: in fact, it dates back to the early days of UseNet, Netiquette and the ol’ “do we allow AOLer’s on the internet” debate. While doing some fact-checking on wikipedia, I was really amused to read about Godwin’s Law, which sums up what I’m talking about better than I ever could: “As a Usenet discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one.”

We all know the Kathy Sierra story. I’m glad she had a thick enough skin to re-emerge in the blogging world and on Twitter because the world is a better place with her contributions than it is without them.

We all remember The Great Sarah Lacy Twitter Massacre of SXSW 2008. I recently met Sarah at a tech event in DC and, believe it or not, she doesn’t have horns, literally or figuratively.

Jason Calacanis recently “retired” from blogging. When I read his post, I immediately thought that it was just a PR stunt, but I’m beginning to realize that I can sympathize with his viewpoint. I really don’t want to ever be an A-list blogger or “internet famous” because it’s just like painting a big target on your own ass.

I love my family and close friends, I love the physical neighborhood in which I live and I love the virtual networks that have developed around my career and my passions for the past 15 years or so that I’ve been using the internet.

But honestly, a big part of me doesn’t want anyone else to read this. Not because I don’t take criticism well. (I don’t, but then again nobody does.) I just wish some of the same general rules that apply to social interactions — at say, a cocktail party, a baseball game or at the supermarket — would apply to the internet.

Comments welcome. Just be nice, ok?

Tags blogging, comments, hate, social media, twitter, web 2.0

bdg dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

Write an ALUI IDS in Under 15 Lines Using Ruby on Rails

Post author By Christopher Bucchere
Post date January 18, 2008

Not only is it possible to write an ALUI Identity Service in Ruby on Rails, it’s remarkably easy. I was able to do the entire authentication part in fewer than 15 lines of code! However, I ran into problems on the synchronization side and ended up writing that part in Java. Read on for all the gory details.

As part of building the suite of social applications for BEA Participate 2008, we’re designing a social application framework in Ruby on Rails and integrating it with ALI 6.5. Not being a big fan of LDAP, I decided to put the users of the social application framework in the database (which is MySQL). Now, when we integrate with ALI, we need to sync this user repository (just as many enterprises do with Active Directory or LDAP).

So I set out to build an IDS to pull in users, groups and memberships in Ruby on Rails.

It’s pretty obvious that Ruby on Rails favors REST over SOAP for its web service support. However, it still supports SOAP for interoperability and it mostly works. I did have to make one patch to Ruby’s core XML processing libraries to get things humming along. I haven’t submitted the patch back to Ruby yet, but at some point I will. Basically, the problem was that the parser didn’t recognize the UTF-8 encoding if it was enclosed in quotes (“UTF-8”). This patch suggestion guided me in the right direction, but I ended up doing something a little different because the suggested patch didn’t work.

I changed line 27 of lib/ruby/1.8/rexml/encoding.rb as follows:

 enc = enc.nil? ? nil : enc.upcase.gsub('"','') #that's a double quote inside single quotes

Now that Ruby’s XML parser recognized UTF-8 as a valid format, it decided that it didn’t support UTF-8! To work around this, I installed iconv, which is available for Windows and *nix and works seamlessly with Ruby. In fact, after installation, all the XML parsing issues went bye-bye.

Now, on to the IDS code. From your rails project, type:

ruby script/generate web_service Authenticate

This creates app/apis/authenticate_api.rb. In that file, place the following lines of code:

class AuthenticateApi < ActionWebService::API::Base
  api_method :Authenticate,
             :expects => [{:Username => :string}, {:Password => :string}, {:NameValuePairs => [:string]}],
             :returns => [:string]
end

All you’re doing here is extending ActionWebService and declaring the input/output params for your web service. Now type the following command:

ruby script/generate controller Authenticate

This creates the controller, where, if you stick with direct dispatching (which I recommend), you’ll be doing all the heavy lifting. (And there isn’t much.) This file should contain the following:

class AuthenticateController < ApplicationController
 web_service_dispatching_mode :direct
 wsdl_service_name 'Authenticate'
 web_service_scaffold :invoke

 def Authenticate(username, password, nameValuePairs)
   if User.authenticate(username, password)
     return ""
   else
     raise "-102" #generic username/password failure code
   end
 end
end

Replace User.authenticate with whatever mechanism you’re using to authenticate your users. (I’m using the login_generator gem.) That’s all there is to it! Just point your AWS to http://localhost:3000/authenticate/api and you’re off to the races.

Now, if you want to do some functional testing (independently of the portal), rails sets up a nice web service scaffold UI to let you invoke your web service and examine the result. Just visit http://localhost:3000/authenticate/invoke to see all of that tasty goodness.

There you have it — a Ruby on Rails-based IDS for ALUI in fewer than 15 lines of code!

The synchronization side of the IDS was almost as simple to write, but after countless hours of debugging, I gave up on it and re-wrote it in Java using the supported ALUI IDK. Although I never could quite put my finger on it, it seemed the problem had something to do with some subtleties about how BEA’s XML parser was handling UTF-8 newlines. I’ll post the code here just in case anyone has an interest in trying to get it to work. Caveat: this code is untested and currently it fails on the call to GetGroups because of the aforementioned problems.

In app/apis/synchronize_api.rb:

class SynchronizeApi < ActionWebService::API::Base
  api_method :Initialize, :expects => [{:NameValuePairs => [:string]}], :returns => [:integer]
  api_method :GetGroups, :returns => [[:string]]
  api_method :GetUsers, :returns => [[:string]]
  api_method :GetMembers, :expects => [{:GroupID => :string}], :returns => [[:string]]
  api_method :Shutdown
end

In app/controllers/synchronize_controller.rb:

class SynchronizeController < ApplicationController
  web_service_dispatching_mode :direct
  wsdl_service_name 'Synchronize'
  web_service_scaffold :invoke

  def Initialize(nameValuePairs)
    session['initialized'] = true
    return 2
  end

  def GetGroups()
    if session['initialized']
      session['initialized'] = false
      groups = Group.find_all
      
      groupNames = Array.new
      for group in groups
        groupNames << "<SecureObject Name=\"#{group.name}\" AuthName=\"#{group.name}\" UniqueName=\"#{group.id}\"/>" 
      end 
      return groupNames
    else
      return nil
    end
  end
  
  def GetUsers()
    if session['initialized']
      session['initialized'] = false
      users = User.find_all
      
      userNames = Array.new
      for user in users
        userNames << "<SecureObject Name=\"#{user.login}\" AuthName=\"#{user.login}\" UniqueName=\"#{user.id}\"/>" 
      end
      
      return userNames
    else
      return nil
    end
  end

  def Shutdown()
    return nil
  end
end

Comments

Comments are listed in date ascending order (oldest first)

Nice post, Chris. This is the first time I’ve seen this done!
Posted by: dmeyer on January 20, 2008 at 4:16 PM

Thank you, David. I just noticed that part of my sync code was chomped off in the blog post because WordPress was assuming that was actually an opening HTML/XML tag. I made the correction so the above code now accurately reflects what I was testing.
Posted by: bucchere on January 21, 2008 at 1:16 PM

Tags api, bea, enterprise software, enterprise web 2.0, how to, ids, milestones, native, open standards, oracle, ruby on rails, sdk

bdg dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

The Enterprise Relevance of Web 2.0

Post author By Christopher Bucchere
Post date October 27, 2007

The concepts behind Web 2.0, social networks, and collaboration are now poised to transform your enterprise, providing solutions such as collaborative mashups, expertise discovery and social search to enhance your existing portal.

According to Gartner, Web 2.0 will have a major impact on a broad range of traditional enterprises. Gartner states that “positive business model change will result in unexpected ways, and enterprises must prepare for this transition.”

Register to attend this exciting seminar on Wednesday, November 14th, 6:00 pm and hear how BEA’s three new products will “two-dot-oh” your company’s Web along with other topics that include:

How Web 2.0 can bring true value to your business
How to differentiate between Web 2.0 and Enterprise 2.0
How to implement new Web 2.0 concepts like blogging, wikis, tagging and social networking into your business and allow IT governance and control
How to enhance your existing portal infrastructure

Enjoy free hors d’oeuvres and an open bar along with presentations that define Web 2.0, show how BEA’s new social computing products (Pages, Ensemble and Pathways) can deliver true business value from Web 2.0, and introduce bdg’s newest products, which bridge the gap between Web 2.0 and the enterprise.

Attendance is limited, so please take a moment to register now. I look forward to meeting you at the event.

Date: Wednesday, November 14th
Time: 6:00 p.m. – 8:00 p.m.
Location: Marriott Tyson’s Corner

8028 Leesburg Pike
Vienna, VA, 22182
(703) 734-3200


Comments

Comments are listed in date ascending order (oldest first)

I’m sorry I missed this! If you have a notification list for events like these please include me, I’d love to hear about future events you guys sponsor. [email protected] Thanks!
Posted by: geoffgarcia on January 17, 2008 at 1:33 PM

Hi Geoff! The event was down here in Tyson’s Corner, VA, so we focused on local attendees. I’ll make sure to include you next time, even though if my memory serves me correctly, you’re up in NY.
Posted by: bucchere on January 17, 2008 at 6:47 PM

Oh, I almost forgot. If I can find the time, I’ll put together a video podcast of the event. I have the footage; I just haven’t had the time to do the editing. 🙁
Posted by: bucchere on January 17, 2008 at 6:48 PM

Tags bea, conferences, enterprise software, enterprise web 2.0, speaking engagements, training, web 2.0

bdg Business dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

Predictions: Will Oracle Acquire BEA?

Post author By Christopher Bucchere
Post date October 12, 2007

There’s been a lot of speculation in response to some press releases from Oracle that an all-cash buyout of BEA may be imminent. More than two years ago, I made an entry on my company’s blog that said, effectively, that by acquiring Plumtree, BEA painted a target on itself to be acquired by Oracle. Here’s the snippet from my other blog dated August 28, 2005:

Will this deal make BEA even more of an acquisition target for Oracle?

Everyone I know — myself included — had a feeling that Plumtree would be acquired some day. But the major questions were 1) when and 2) by whom? Quite some time ago and long before Plumtree had its Java strategy fleshed out, there were rumors of a Microsoft takeover. Then Siebel. Then Peoplesoft. But BEA? I never would have guessed.

I personally thought Oracle would be the suitor, especially after they acquired Oblix, PeopleSoft and J.D. Edwards. After extending its tentacles into almost every enterprise software market (and proving tremendously incapable of producing any decent software applications other than a database), Oracle snapped up ERP, HR and SSO/Identity Management in the blink of an eye. It seemed reasonable to me that a good portal product that could integrate with all those applications would be a clear next target. Oracle’s portal certainly doesn’t cut the mustard. In fact, they often offer it up for free only to be beaten out by Plumtree, which is, ahem, a far cry from free.

Now the next pressing question: is Oracle even more likely to acquire Plumtree now that they’re a part of BEA? Now they’d get an excellent application server and a cross-platform, industry-leading portal. You know it crossed Larry Ellison’s mind when he heard the news. Food for thought.

I also said that BEA would keep the name Plumtree and lo-and-behold, they changed it to AquaLogic. So I wasn’t 100% right, but at least I can say that I called this one.

Comments

Comments are listed in date ascending order (oldest first)

Someone just walked into my office and said, “Hey, since BEA already has a dual portal strategy (ALI and WLP), what will happen if they get acquired by Oracle, which already has their own portal product?”
Two years ago, I predicted a merging of WLP and ALI, with the result being much like ALI with the great developer tools you get from WLP and workshop tacked on to it. Obviously that’s not exactly how things played out.

So my prediction this time is that all three portals will “seamlessly” co-exist under one roof, giving consumers plenty of ways to portalize all under the Oracle name. We’ll call it the Portal Trifecta — w00t!

Posted by: bucchere on October 12, 2007 at 10:40 AM

Oracle is going to support SqlServer 2000 & 2005 for Aqualogic? And support .NET? Interesting if they would sell the Aqualogic piece of to to Microsoft. Give MOS a better external portal….?
Posted by: vivekvp on October 12, 2007 at 11:37 AM

Great question, Vivek. I was surprised to see BEA pledge support for ALUI on .NET and SQL Server. I’ll be even more surprised to see that happen over at Oracle. Remember though, Oracle runs on Windows!
Posted by: bucchere on October 12, 2007 at 12:08 PM

Chris, don’t you mean 4 portal products; ALUI, WLP, Oracle Portal, and WebCenter? The merger makes a lot of sense from my view point, but in all seriousness the one area which will need a lot of help is Portal. IBM has only one WebSphere Portal code base.
Posted by: Dr. BEA Good on October 16, 2007 at 9:33 PM

It’s hard to image that a company maintains three or four full-featured portal products, even a giant like IBM, Oracle or MS.
Posted by: caiwenliang on October 17, 2007 at 5:16 AM

Four portals? Yikes! I just don’t want confused consumers to go off and buy Sharepoint or WebSphere portal when I think ALUI and WLP are superior products.
Posted by: bucchere on October 18, 2007 at 2:11 PM

Tags analysis, bea, customers, enterprise software, oracle, shameless self-promotion

Say hello world to comet

A couple of weekends ago I inflicted upon myself a quest to discover what all the buzz was about regarding Comet. What I discovered is that there is quite a bit of code out there to help you get started but the documentation around that code, and about Comet in general, is severely lacking. All I really wanted to find was a Comet-based Hello World, which as any developer knows, is the cornerstone of any programming language or methodology.

Since I couldn’t find one on Google, I ascertained that no Hello World exists for Comet and therefore I took it upon myself to write one.

For those of you who are new to Comet, the first thing you should do is read Alex Russell’s seminal blog post on the topic. At its core, Comet is really just a message bus for browser clients. In one browser, you can subscribe to a message and in another you can publish a message. When a message gets published, every browser that’s subscribed (almost) instantaneously receives it.

What? I thought clients (browsers) had to initiate communication per the HTTP spec. How does this work?

Under the covers, Comet implementations use a little-known feature of some web server implementations called continuations (or hanging gets). I won’t go into details here, but at a high level, a continuation initiates from the browser (as all HTTP requests must do) and then, when it’s received by the server, the thread handling it basically goes to sleep until it gets a message or times out. When it times out, it wakes up and sends a response back to the browser asking for a new request. When the thread on the server receives a message, it wakes up and sends the message payload back to the browser (which also implies that it’s time to send a new request). Via this mechanism, HTTP is more or less “inverted” so that the server is essentially messaging the client instead of vice-versa.
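To make the hanging get a little less abstract, here is a toy model of the server side in plain Java. It is a simplification with made-up names: a blocking queue stands in for the message bus, and handleRequest parks the calling thread the way the description above says, returning either the message or a "please reconnect" response on timeout. It is not how Jetty or any real Comet server is implemented.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy model of a hanging GET: the request waits for a message or a timeout.
class HangingGetSketch {

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    // Called when some other client publishes to the channel.
    void publish(String message) {
        pending.add(message);
    }

    // Simulates the server handling one long-lived request. The thread
    // sleeps until a message arrives or timeoutMillis passes; either way
    // the client is expected to issue a fresh request afterwards.
    String handleRequest(long timeoutMillis) {
        try {
            String message = pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            return message != null ? "message:" + message : "timeout:reconnect";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "timeout:reconnect";
        }
    }
}
```

A client loop would simply call handleRequest again as soon as either response comes back, which is what keeps one request always parked on the server.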

A few questions immediately pop into mind, so let’s just deal with them right now:

Why is this better than Ajax alone?

It boils down to latency and users’ tolerance for it. In the worst case, traditional web applications force entire page refreshes. Ajax applications are a little better, because they can refresh smaller parts of a page in response to users’ actions, but the upshot is that the users are still waiting for responses, right? A Comet-driven application has essentially removed the user from the picture. Instead of the user asking for fresh data, the server just sends it along as soon as it changes, giving the application more of a “realtime” feel and removing virtually all perceived latency.

So are we back to client server again?

Sort of. Comet gives you the benefit of server-to-client messaging without the deployment issues associated with fat clients.

Can’t applets do this?

Of course they can. But who wants to download an applet when some lightweight Javascript will do the trick?

Why the name Comet?

Well, clearly it’s a pun on Ajax. But it’s not the only name for this sort of technology. There’s something out there called pushlets which claims to do the same thing as Comet, but which didn’t seem to catch on, I guess.

Back to the whole point of this post: my hello world. I pieced this example together using dojo.io.cometd and a recent version of Tomcat into which I dropped the relevant parts of Jetty to provide support for continuations.

It’s finally time to say “hello world” to my hello world.

First off, download one of the more recent dojo builds that contains support for dojo.io.cometd. Drop dojo.js on your Java-based web/application server. (I used Tomcat, but you can use JBoss, Jetty, Weblogic, Websphere or any other web server with support for servlets.) Add this page in the root of your application:

<script src="js/dojo.js" type="text/javascript"></script>
<script type="text/javascript">//<![CDATA[
  dojo.require("dojo.io.cometd");
  cometd.init({}, "cometd");
  cometd.subscribe("/hello/world", false, "publishHandler");
  publishHandler = function(msg) { alert(msg.data.test); }
// ]]></script>
<input type="button" value="Click Me!" />

Without a cometd-enabled web server behind it, the above page does absolutely nothing.

So, to make this work, I needed to find a Java-based web/application server with support for continuations. I’m sure there are many ways to skin this cat, but I picked Jetty. You can get Jetty source and binaries if you’d like to follow along. Since all of our customers who embrace open source are lightyears more comfortable with Tomcat than they are with any other open source web/application server (ahem . . . Jetty), I decided to embed Jetty in Tomcat rather than run everything on Jetty alone. It’s all just Java, right?

Here I ran into a few snags. The maven build file for Jetty didn’t work for me, so I dropped everything in org.mortbay.cometd and org.mortbay.cometd.filter into my Eclipse project and just integrated it with the ant build.xml I was already using to build my web application. Here’s the relevant portion of my build.xml:

<javac srcdir="${srcdir}" destdir="${classdir}" debug="true" debuglevel="lines,vars,source">
  <classpath>
    <pathelement location="${jetty.home}/lib/jetty-util-6.0.1.jar"/>
    <pathelement location="${jetty.home}/lib/servlet-api-2.5-6.0.1.jar"/>
  </classpath>
</javac>

Once Jetty was essentially hacked into Tomcat, the rest was smooth sailing. I just wrote a JSP that dropped a “goodbye world” message onto the same old queue that I used in the last example, but I did so using server-side code. Here’s the JSP:

<%@page import="org.mortbay.cometd.*"%>
<%@page import="java.util.*"%>
<%
Bayeux b = (Bayeux)getServletContext().getAttribute(CometdServlet.ORG_MORTBAY_BAYEUX);
Channel c = b.getChannel("/hello/world");
Map message = new HashMap();
message.put("test", "goodbye world");
c.publish(message, b.newClient());
%>

This page does not produce any output of its own; rather, it just drops the “goodbye world” message on the queue. When you hit this page in a browser, any other browser listening to the /hello/world queue will get the message. The above JSP, along with the dojo page you created in the first step, should be enough to wire together two different flavors of Comet messaging: browser to server to browser and just plain old server to browser.
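For anyone who wants the fan-out idea without the Jetty plumbing, here is a rough sketch of what "drop a message on a channel and every listener gets it" amounts to. Everything in it (the ChannelSketch class, its subscribe and publish methods) is hypothetical and lives purely in memory; it is not the org.mortbay.cometd API and it makes no attempt at thread safety.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory channel registry; not the org.mortbay.cometd API.
final class ChannelSketch {

    private final Map<String, List<Consumer<String>>> listeners = new HashMap<>();

    // A browser (or anything else) registers interest in a channel.
    void subscribe(String channel, Consumer<String> listener) {
        listeners.computeIfAbsent(channel, k -> new ArrayList<>()).add(listener);
    }

    // Delivers the message to every listener on the channel and
    // returns how many listeners received it.
    int publish(String channel, String message) {
        List<Consumer<String>> targets = listeners.getOrDefault(channel, new ArrayList<>());
        for (Consumer<String> target : targets) {
            target.accept(message);
        }
        return targets.size();
    }

    public static void main(String[] args) {
        ChannelSketch hub = new ChannelSketch();
        hub.subscribe("/hello/world", msg -> System.out.println("browser A got: " + msg));
        hub.subscribe("/hello/world", msg -> System.out.println("browser B got: " + msg));
        hub.publish("/hello/world", "goodbye world");
    }
}
```

Whether the publisher is another browser or a JSP on the server makes no difference to the listeners, which is the whole appeal.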

I’m curious 1) if this was helpful and 2) if you’d like to share what you’re doing with Comet with me (and please don’t say cleaning your kitchen).

Tags ajax, comet, hacks, how to, jetty, tomcat

bdg Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

BEAWorld 2006 Speaking Engagement

Post author By Christopher Bucchere
Post date September 15, 2006
No Comments on BEAWorld 2006 Speaking Engagement

This just in: my talk on ALUI Taglibs begins at 1:50 PM on Monday the 18th at the ALUI Developer User Group Meeting, which will be held in one of the rooms on the 120 block of Moscone Center. See you there! (Be sure to come up and introduce yourself.)

Tags bea, conferences, speaking engagements

bdg dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

UUID Object Opener, The Coolest ALI Taglib Yet

Post author By Christopher Bucchere
Post date September 12, 2006
No Comments on UUID Object Opener, The Coolest ALI Taglib Yet

Anyone who’s ever done a major Plumtree/ALUI deployment knows of this problem: You create a portlet or community (or any other object) in Dev and then you migrate it to Test and on to Production. The problem is that you’ve also written some code in your navigation portlet or in another portlet that depends on an ObjectID (e.g. you’ve used a pt:standard:opener tag) and now, in each environment, your ObjectID has changed and you’re basically hosed.

Pre-G6, I came up with a solution described (somewhat hastily) in this post, but it requires a lot of leg work and — worse yet — manual configuration in each environment.

Enter G6 and the magic of taglibs. (Am I beginning to sound like a broken record? Yes, I know, you can’t fix every problem with a taglib, just 95% of them, right?) With this new taglib I wrote today, I extend AOpenerLinkTag and simply convert a UUID to an ObjectID and ClassID so that you can use the same taglib invocation in every environment. I don’t want to toot my own horn too much here, but honestly, this is pretty much the most useful taglib I’ve ever encountered, and once again, it took under 30 minutes to write.
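To make the idea concrete without touching the portal API, here is a toy model of the lookup. The class, the placeholder UUID and the ID numbers are all invented for illustration; the only point is that the same UUID resolves to a different ObjectID in each environment, so code that holds the UUID never needs to change.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a per-environment UUID lookup; not the Plumtree API.
final class UuidLookupSketch {

    // Maps a UUID to {classId, objectId} for ONE environment.
    private final Map<String, int[]> table = new HashMap<>();

    void register(String uuid, int classId, int objectId) {
        table.put(uuid, new int[] { classId, objectId });
    }

    // Returns {classId, objectId}, or fails loudly if the UUID is unknown.
    int[] resolve(String uuid) {
        int[] ids = table.get(uuid);
        if (ids == null) {
            throw new IllegalArgumentException("Unknown UUID: " + uuid);
        }
        return ids.clone();
    }

    public static void main(String[] args) {
        String uuid = "{EXAMPLE-UUID}";     // placeholder, not a real UUID

        UuidLookupSketch dev = new UuidLookupSketch();
        UuidLookupSketch prod = new UuidLookupSketch();
        dev.register(uuid, 512, 201);       // made-up IDs for Dev
        prod.register(uuid, 512, 347);      // same object, different ObjectID in Production

        System.out.println("dev objectid:  " + dev.resolve(uuid)[1]);
        System.out.println("prod objectid: " + prod.resolve(uuid)[1]);
    }
}
```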

Before I dive into the source, let me back up and say that I had to bend the rules a bit. OOTB, there are two subclasses of ATagAttribute: RequiredTagAttribute and OptionalTagAttribute. I added a third: MutableTagAttribute. It looks and smells like a tag attribute, but under the covers it’s not. Instead of grabbing its value out of the tag invocation, it allows you to set/change the value at runtime inside the taglib code. Granted, this is a little weird, but it’s what I needed to do in order to subclass AOpenerLinkTag and keep it happy dappy.

MutableTagAttribute.java:

package com.bdgportal.alui.taglibs;

import com.plumtree.portaluiinfrastructure.tags.metadata.*;

public class MutableTagAttribute extends ATagAttribute {

  private String value;
  
  public MutableTagAttribute(String name, String desc, AttributeType type) {
    super(name, desc, type);
  }
  
  public String GetDefaultValue() {
    return value;
  }

  public void SetDefaultValue(String value) {
    this.value = value;
  }
  
  public boolean GetIsRequired() {
    return false;
  }
}

Now that we have a tag attribute that we can change on-the-fly, writing the taglib was a snap.

UUIDObjectOpener.java:

package com.bdgportal.alui.taglibs;

import com.plumtree.portaluiinfrastructure.tags.*;
import com.plumtree.portaluiinfrastructure.tags.metadata.*;
import com.plumtree.xpshared.htmlelements.*;
import com.plumtree.taglib.standard.basetags.*;
import com.plumtree.server.*;

public class UUIDObjectOpener extends AOpenerLinkTag
{
  public static final RequiredTagAttribute UUID;
  private MutableTagAttribute OBJECT_ID;
  private MutableTagAttribute CLASS_ID;


  public UUIDObjectOpener() {
    OBJECT_ID = new MutableTagAttribute("objectid", "Not used -- do not set a value for this!", AttributeType.INT);
    CLASS_ID = new MutableTagAttribute("classid", "Not used -- do not set a value for this!", AttributeType.INT);
  }

  public ATagAttribute GetObjectIDAttribute()
  {
    return OBJECT_ID;
  }

  public ATagAttribute GetClassIDAttribute()
  {
    return CLASS_ID;
  }

  public static final ITagMetaData TAG;

  static
  {
    TAG = new TagMetaData("uuidobjectopener", "Opens an object based on its UUID.");
    UUID = new RequiredTagAttribute("uuid", "The UUID for the object you want to open.", AttributeType.STRING);
  }

  public HTMLElement DisplayTag()
  {
    Object[] objectAndClassId = ((IPTMigrationManager)(((IPTSession)GetEnvironment().GetUserSession()).OpenGlobalObject(PT_GLOBALOBJECTS.PT_GLOBAL_MIGRATION_MANAGER,
          false))).UUIDToObjectID(GetTagAttributeAsString(UUID));
  
    OBJECT_ID.SetDefaultValue(objectAndClassId[PT_MIGRATION_OBJECT_COLS.PT_MOC_OBJECTID].toString());
    CLASS_ID.SetDefaultValue(objectAndClassId[PT_MIGRATION_OBJECT_COLS.PT_MOC_CLASSID].toString());
    return super.DisplayTag();
  }

  public ATag Create()
  {
    return new UUIDObjectOpener();
  }
}

To deploy this code, see the excellent section on edocs about creating custom Adaptive Tags.

To use this code in a portlet, do the following.

myportlet.htm:

<span xmlns:pt='http://www.plumtree.com/xmlschemas/ptui/'>
   <pt:mytaglibns.uuidobjectopener pt:uuid="{00000-0000-0000-000000}" pt:mode="2">Open My
   Object</pt:mytaglibns.uuidobjectopener>
</span>

I did actually test this taglib and it worked swimmingly. Of course you need to substitute a real UUID for all those 0s.

In closing, here’s a little shameless plug: I’ve been asked by BEA to give a short, 20-minute talk at BEA World on my favorite subject (duh, taglibs) at the ALUI Developer User Group on Monday, September 18th in Moscone Center, San Francisco. It will happen some time between 1 and 5:30 PM. The ALUI User Groups are free for conference attendees. I hope to see you there or at the bdg booth. Please come on up and introduce yourself — I always like to meet members of this great community in person.

Enjoy!

Comments

Comments are listed in date ascending order (oldest first)

Will there be any performance issues using this tag as it involves additional operations of getting Object ID and Class ID from the UUID?
Posted by: psudhir_it on February 6, 2007 at 10:15 PM

From what I can tell, the tag makes a single SQL query (something like select objectid, classid from ptmigration where uuid = ?) which should be a pretty darn fast query, especially since there’s probably an index on uuid.
The portal is making database calls left and right when you’re displaying a portal page, so making one more database call to generate an opener link shouldn’t really be a performance factor. Nonetheless, it’s definitely something to think about and I’m glad you brought it up.

Posted by: bucchere on February 7, 2007 at 5:53 PM

Hi Chris! Am attempting to move this over to .NET; can you tell me which reference I need to add to resolve com.plumtree.taglib.standard and the AOpenerLinkTag? I’m not sure how to convert this Java fragment, which appears to have two separate definitions of TAG: public static final ITagMetaData TAG; static { TAG = new TagMetaData(“uuidobjectopener”, “… UUID.”); …can you tell me what it means, and any tips on converting to C#? Should have an opportunity to throw some load at this later on; will post my results here. My customer is already sensitive to performance problems caused by header portlets making DB calls; so I will also be looking into the caching possibilities. Cheers, Rob
Posted by: rwagner on October 10, 2007 at 11:04 AM

Here is another option. The little known server.pt?uuID={XYZ-UUID} syntax. We use this in our public site which is not gatewayed to deep link into portal content without the need for an adaptive tag. We also use this to establish fqdns in apache that redirect to portal pages. For example in apache setup a fqdn of docs.bea.com which points to portal.bea.com/portal/server.pt?uuID={XYZ-UUID}.
Posted by: ryanyoder on February 11, 2008 at 6:18 AM

Wow, very cool! I totally didn’t know that syntax even existed. If it’s supported, it ought to be documented, because it’s quite handy.
One gotcha is that you need to pass mode=2 if you want to open the object in view mode because the default is edit mode, e.g.: /portal/server.pt?uuID={46514C0F-0187-4340-AA24-84E41C00C60F}&mode=2

Posted by: bucchere on February 11, 2008 at 6:31 AM

Tags bea, conferences, gotcha, how to, shameless self-promotion, taglibs

bdg dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

My Love Affair with ALI Taglibs

Post author By Christopher Bucchere
Post date September 7, 2006
No Comments on My Love Affair with ALI Taglibs

There’s been some recent activity on this very old thread in the newsgroups regarding displaying the help link in a portlet. Until G6, this could only be done with native code AFAIK. But, if you suppress the portlet title bar, there really aren’t many places where you can put native code in a portlet.

Enter G6 and the extensible taglib support, a quiet little feature that (without any fanfare or marketing by BID) has seriously changed my life.

The source speaks for itself. It took 15 minutes to write. (Granted, I already had my ALUI development environment all set up.)

HelpURL.java:

package com.bdgportal.alui.taglibs;

import com.plumtree.openfoundation.util.*;
import com.plumtree.portaluiinfrastructure.tags.*;
import com.plumtree.portaluiinfrastructure.tags.metadata.*;
import com.plumtree.server.*;
import com.plumtree.xpshared.htmlelements.*;

public class HelpURL extends ATag {

public static final ITagMetaData TAG;
public static final RequiredTagAttribute PORTLET_ID;
public static final RequiredTagAttribute ID;
public static final OptionalTagAttribute SCOPE;

static
{
 TAG = new TagMetaData("helpurl",
   "Puts the help URL for this portlet into the variable specified by the ID attribute.");

 PORTLET_ID = new RequiredTagAttribute("portletid",
   "The portlet ID.",
   AttributeType.INT);

 ID = new RequiredTagAttribute("id",
   "The name of the variable in which the help link should be stored.",
   AttributeType.STRING);

 SCOPE = new OptionalTagAttribute("scope",
   "The scope used to store the help link.",
   AttributeType.STRING, Scope.PORTLET_REQUEST.toString());
}

public HTMLElement DisplayTag() {
 ((IXPList)GetState().GetSharedVariable(GetTagAttributeAsString(ID),
  Scope.GetScope(GetTagAttributeAsString(SCOPE)))).Add(
     ((IPTWebService)((IPTSession)GetEnvironment().GetUserSession()).GetWebServices()
  .Open(((IPTGadget)((IPTSession)GetEnvironment().GetUserSession()).GetGadgets()
  .Open(GetTagAttributeAsInt(PORTLET_ID), false)).GetWebServiceID(), false))
  .GetProviderInfo().ReadAsString("PTC_HTTPGADGET_HELPURL"));
 return null;
}

public ATag Create() {
 return new HelpURL();
}
}

To deploy this code, see the excellent section on edocs about creating custom Adaptive Tags.

To use this code in a portlet, do the following.

myportlet.htm:

<span xmlns:pt='http://www.plumtree.com/xmlschemas/ptui/'>
	<pt:mytaglibns.helpurl pt:portletid="234" pt:id="helplink"/>
	<pt:core.html pt:tag="a" href="$helplink">Help</pt:core.html>
</span>

I didn’t test this, so YMMV. Have fun!

Comments

Comments are listed in date ascending order (oldest first)

That’s slick, Chris – that’ll be handy for porting between dev/stage/prod where objectids may be different 🙂
Posted by: ewwhitley on September 13, 2006 at 6:20 AM

Hi, This code makes ten database requests just to get the IPTWebService object for given portlet. Is there any better way to do this?
Posted by: Piotr Dudkiewicz on May 18, 2007 at 6:48 AM

Sorry, but there’s no better way to get the help URL out of the web service. ALUI is optimized to make calls to its database and the UI code does that everywhere — it’s a dynamic web application, so that should be expected.
Posted by: bucchere on May 29, 2007 at 2:03 PM

It seems that ALUI is optimized to do as many database calls as it’s possible;) Thanks.
Posted by: Piotr Dudkiewicz on June 1, 2007 at 2:46 AM

Tags bea, edk, enterprise software, how to, idk, java, native, portlets, sdk, taglibs

dev2dev Featured Posts Plumtree • BEA AquaLogic Interaction • Oracle WebCenter Interaction

A Slick Alternative to pt:standard.openerlink

Post author By Christopher Bucchere
Post date August 30, 2006
No Comments on A Slick Alternative to pt:standard.openerlink

For one reason or another your bosses (we all have more than one, don’t we?) have told you that the look-n-feel of the common object opener in ALUI just doesn’t cut it. Even though it’s powerful, scalable and pretty nice-looking and it includes a myriad of options (e.g. search, browse, single vs. multi-select, set previously selected, etc.), they just want something different. Perhaps they don’t want a pop-up window. Perhaps they don’t like how many clicks it takes to get down to an object. Perhaps they’re just being difficult.

Regardless, you’ve been asked to come up with a clean, fast, in-place object selector that still shows a hierarchical view. (For the purposes of this discussion, I’m going to use the example of communities from here on out instead of just talking about “objects.”) So naturally, as a portlet developer, you turn to the IDK. Unfortunately, if you want to get portal metadata, you have to use the PRC/SOAP server. There goes fast. So maybe you can write it in native code or using database calls. There goes clean.

Your best bet here — and really the only good way to accomplish this — is to develop a custom taglib. Custom taglibs are quickly becoming my favorite new feature in G6. (BTW, if you aren’t on G6, upgrade ASAP — it’s worth the effort.) So, for your benefit, I decided to try my hand at writing a taglib to present a nice hierarchy of communities. Here’s what I discovered.

First off, let’s talk about the HTML I want to display in my custom tag for a moment. Whoever came up with the concept of select boxes and optgroup elements was a complete goofball. Why develop something that’s naturally suited for a hierarchy and then limit the hierarchy to a depth of one?

Here’s an example:

So I had to throw my initial idea of using nested option elements out the window simply because you can’t nest an optgroup within an option. Bummer.

So here’s the display I settled on:

There’s still a hierarchy here, it’s just flattened and there’s essentially a “breadcrumb” for each community. In the example I have, bdg is a top level community and services is a subcommunity of bdg. Consulting, development, integration and training are all subcommunities of services.
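Before getting into the portal code, here is the flattening idea on its own, as a small sketch in plain Java. The Node class and the flatten method are hypothetical stand-ins (nothing here comes from the ALUI API); the community names are the ones from the example above and the " : " separator is just the one I happened to pick.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of breadcrumb flattening; Node is a hypothetical type.
final class BreadcrumbSketch {

    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        // Adds a child and returns this node so calls can be chained.
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Depth-first walk that emits one "parent : child" label per node.
    static void flatten(Node node, String prefix, List<String> out) {
        String label = prefix.isEmpty() ? node.name : prefix + " : " + node.name;
        out.add(label);
        for (Node child : node.children) {
            flatten(child, label, out);
        }
    }

    public static void main(String[] args) {
        Node services = new Node("services")
            .add(new Node("consulting"))
            .add(new Node("development"))
            .add(new Node("integration"))
            .add(new Node("training"));
        Node bdg = new Node("bdg").add(services);

        List<String> labels = new ArrayList<>();
        flatten(bdg, "", labels);
        for (String label : labels) {
            System.out.println(label);  // e.g. "bdg : services : consulting"
        }
    }
}
```

Each label would become the text of one option element, so the select box stays flat while the hierarchy remains readable.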

Alrighty then, so how do you construct this nice select box? And BTW, make it easy, clean and fast. Here you go:

package com.bdgportal.alui.taglib;

import com.plumtree.openlog.OpenLogService;
import com.plumtree.openlog.OpenLogger;
import com.plumtree.portaluiinfrastructure.tags.*;
import com.plumtree.portaluiinfrastructure.tags.metadata.*;
import com.plumtree.xpshared.htmlelements.*;
import com.plumtree.server.*;

public class CommunitySelector extends ATag
{
   private static OpenLogger log = OpenLogService.GetLogger(
    OpenLogService.GetComponent("UI_Infrastructure"),
    "com.bdgportal.alui.taglib.CommunitySelector");

 public static final ITagMetaData TAG;
 public static final RequiredTagAttribute SELECT_ID;
 public static final RequiredTagAttribute SELECT_NAME;
 public static final OptionalTagAttribute SELECT_CLASS;
 public static final RequiredTagAttribute ROOT_FOLDER_ID;

 static
 {
  TAG = new TagMetaData("communityselector",
    "Displays a community selector.");

  SELECT_ID = new RequiredTagAttribute("id",
    "The id of the select box.",
    AttributeType.STRING);

  SELECT_NAME = new RequiredTagAttribute("name",
    "The name of the select box.",
    AttributeType.STRING);
 
  ROOT_FOLDER_ID = new RequiredTagAttribute("rootfolderid",
    "The root folder. All communities in this folder and below " +
    "will be displayed.",
    AttributeType.INT);

  SELECT_CLASS = new OptionalTagAttribute("class",
    "The CSS class of the select box.",
    AttributeType.STRING, "objectText");
 }

 public HTMLElement DisplayTag()
 {
  HTMLSelect comms = new HTMLSelect(
    GetTagAttributeAsString(SELECT_NAME),
    GetTagAttributeAsString(SELECT_ID));  

  comms.SetStyleClass(GetTagAttributeAsString(SELECT_CLASS));
  recursiveAddComms(((IPTSession)GetEnvironment().GetUserSession())
   .GetCommunities(), ((IPTSession)GetEnvironment()
   .GetUserSession()).GetAdminCatalog(), comms,
   GetTagAttributeAsInt(ROOT_FOLDER_ID), "");
 
  return comms;
 }

 public ATag Create()
 {
  return new CommunitySelector();
 }

 public TagType GetTagType() {
  return TagType.NO_BODY;
 }
 
 private void recursiveAddComms(IPTObjectManager commObjMgr,
  IPTAdminCatalog adminCatalog, HTMLSelect comms,
  int folderId, String prefix) {
 
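  //CAB: add the communities at this level, if any
  IPTQueryResult commsToAdd = commObjMgr.SimpleQuery(folderId,
   PT_PROPIDS.PT_PROPID_NAME);
  //strip the trailing " : " separator; the prefix is empty at the root
  //folder, so guard against a negative substring index
  String label = prefix.length() >= 3
   ? prefix.substring(0, prefix.length() - 3) : prefix;
  for (int i = 0; i < commsToAdd.RowCount(); ++i) {
   comms.AddOption(new HTMLOption(
    Integer.toString(commsToAdd.ItemAsInt(i, PT_PROPIDS.PT_PROPID_OBJECTID)),
    label));
  }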
 
  IPTAdminFolder adminFolder = adminCatalog.OpenAdminFolder(folderId, false);
  if (0 == adminFolder.QuerySubfoldersCount()) {
   return; //CAB: base case
  } else {
   IPTQueryResult subFolders = adminFolder.QuerySubfolders(
    PT_PROPIDS.PT_PROPID_OBJECTID + PT_PROPIDS.PT_PROPID_NAME
    + PT_PROPIDS.PT_PROPID_FOLDER_FOLDERTYPE,
    0,
    PT_PROPIDS.PT_PROPID_NAME,
    0,
    -1,
    null);
 
   //CAB: recurse into each subfolder
   for (int i = 0; i < subFolders.RowCount(); ++i) {
    recursiveAddComms(commObjMgr, adminCatalog, comms,
     subFolders.ItemAsInt(i, PT_PROPIDS.PT_PROPID_OBJECTID),
     prefix + subFolders.ItemAsString(i, PT_PROPIDS.PT_PROPID_NAME)
     + " : ");
   }
  }
 }
}

I think the code pretty much speaks for itself, but if you want further explanation, let me know by posting a comment.

Comments

Comments are listed in date ascending order (oldest first)

Iiiinteresting. This is very cool, Chris. I might be forced to highjack this and turn it into a pt:data tag 🙂 You have any thoughts on how / where you might approach caching with this? Have you seen the EOD sample tag?
Posted by: ewwhitley on August 31, 2006 at 9:41 AM

Hmmm . . . caching. It’s so fast OOTB that I didn’t think about caching it. 🙂 Plus, I used it on a project where we have fewer than 100 communities, so I didn’t have any problems. I mean, we’re not using the PRC, right? I guess if you wanted to cache it you could doink around with the shared variables in the tag library (session scope), but you’ve got to worry about clearing the cache after a fixed interval of time.
Posted by: bucchere on August 31, 2006 at 10:31 AM

Awesome! Thanks Chris. I just spent about 2 hours last night trying to do almost the exact same thing. This is great!
Posted by: jturmelle on May 17, 2007 at 7:48 AM

Tags api, edk, idk, native, prc, taglibs