Monday, July 6, 2009

Publisher and WebCenter Not Playing Well Together

Those of you who have taken the time to install and become familiar with the latest version of "ALUI" may have run into an issue with the integration of Publisher 6.5. I recently stood up the Oracle WebCenter Interaction 10gR3 suite of products on a virtual machine. For the most part, the products seemed to integrate properly with each other (Collab, Analytics, etc.).

Installing Publisher was as easy as usual, but the product is not fully integrated with the WebCenter framework. The main drawback is that Publisher is not configurable through the WebCenter Configuration Manager web-based interface; you still need to go in and manually edit the Publisher config files on the server. This omission was most likely made because of the eventual shift to UCM (Stellent) in the next version of WebCenter. Even more annoying than dealing with the config files is the disturbing realization that the Published Content Portlets .pte server import package does not play nicely with WebCenter.

After installing Publisher, configuring the database, and checking the diagnostics page (no issues reported), I tried to import the Published Content .pte file. The portal's import utility finished successfully, but upon further inspection, several of the key files and Publisher templates were missing for certain languages. For example, the EN (English) News publisher template folder did not contain an images folder, while other language versions of the same template did. I tried re-importing the .pte file but was unable to get a full set of Publisher files to import properly. The resulting templates were broken, leaving users unable to use the out-of-the-box News, Announcement, Content Canvas, Community Directory, Header, or Footer publisher portlets.

After much trial and error I found a workaround (what, you didn't think I was going to post something without some sort of "solution," did you?). By extracting and editing the files within the published-content-portlets.zip file that is referenced by the .pte file during import, I was able to get the import to work properly. The trick is to eliminate languages other than English from the file. Once the files are edited and English is the only language being imported (no need to edit the .pte file itself), the import finishes properly for all EN portlets, and all files and folders are available for the publisher templates. Those of you who need to support multi-language portals may need to contact Oracle Support and open a ticket for this issue. I was able to replicate the problem on several machines (both .NET and Java portals), so I don't think it is machine-specific.

If anyone has come across another solution to this issue, please post it as well. Rather than forcing you to make the language modifications to the ZIP file yourselves, I have posted my edited version here. Thankfully, in release 11g of WebCenter we won't have to jump through these Publisher hoops anymore. I just hope Oracle has a good upgrade path in place to get content migrated from Publisher to UCM.

Tuesday, July 22, 2008

Dot Com Portals: Smart Searching

Most customers that purchase AquaLogic User Interaction (or Plumtree, or WebCenter, or whatever the portal is known as tomorrow) decide to implement it as an intranet portal. This is the easiest way to produce a collaborative user experience and deploy the product internally as an employee-facing pilot. With the advent of ALUI 6.0 and advanced UI tweaking through adaptive tags, more and more customers are pushing the portal outside the firewall. SunTrust Bank (http://www.suntrust.com/), the City of Eugene, Oregon (http://www.eugene-or.gov/portal/server.pt), and Safeco Insurance (http://www.safeco.com/) are just some examples of public-facing "dot com" websites powered by the AquaLogic portal.

AquaLogic does a solid job of searching content out-of-the-box, but an issue quickly arises when contemplating how to properly search a public-facing dot com site with the portal's search engine. Most public-facing dot com websites employ a good amount of Publisher-based content. By coupling structured data entry templates with well-defined presentation templates, a customer can build a large number of dot com web pages from a series of content items and published content portlets. Since the portal natively indexes Publisher content items and returns results that link directly to the item itself, a standard search will not suffice for a traditional dot com website. Users would enter a search term, click on a result, and immediately find themselves viewing a single content item outside the framework of the website (no header, footer, LiquidSkin navigation, etc.). An alternate crawling option must be explored if we want to direct users to the actual web page in which the content item resides.

The easiest way to go about this is to point a web crawler at the home page of the portal and crawl in all pages. This would generally be an acceptable solution if site navigation didn't get in the way. All the pages would be full-text indexed, capturing all the content on each page. Unfortunately, this includes ALL the content: the site's navigational links would be included in the full-text index of each page. Therefore, if someone searched for "Customer Service" on a dot com portal site and a link to the customer service page appeared in the navigation on every page, the user would be presented with a result set containing every page in the entire site. Most search terms would return proper results, but searches against terms featured in the header, footer, or other navigational pieces of the site would return vastly skewed results. How do we properly crawl a dot com portal website without crawling the navigation? I came up with the following solution while working for a recent client.

Without getting down and dirty coding a very intricate custom crawler, I came up with a way to avoid navigation using the portal's standard web crawling functionality. The real key to making this work is to create an alternate experience definition with the portal's header, footer, and any LiquidSkin navigation removed. Simply copy the existing (public-facing) dot com website experience definition, rename it with a "Search Experience" designation, and delete the header and footer portlets. The next step is to create a new experience rule that points users in a specific IP address range to the Search Experience definition. In setting up the rule, use the IP address(es) of your portal's Automation Server(s). With these two experience pieces in place, a web crawler running in a job on your Automation Server will crawl against the alternate experience definition and not index any text found in the portal's navigation. This method will also allow end users to search your dot com website and click on results that take them directly to the web page in which the content resides.

The "catch" in using this alternate experience definition method is that you cannot crawl the dot com website directly from the home page. Without navigation in place, the crawler would not be able to follow the links on your site and properly crawl every page. A workaround is to create a static HTML file somewhere on your portal's image server. This file simply contains direct links to each community page in your site. This may seem cumbersome at first, but it is fairly easy to run a SQL query against the portal database to return a list of object IDs for each page, then use search and replace in a text editor to create the proper link format from that list. Once the HTML file is built correctly and placed in an accessible location, simply point your portal's dot com website crawler at the HTML page and have it follow one level of links. For my recent client we currently have a static list of all pages in the site, but we plan on coding an automated page in .NET that queries the database to build the list of page links on the fly. Such an effort is not very intensive, as the SQL query is already written, and server-side processing is not an issue since the custom .NET page will only be accessed once a day when the crawl runs.

The end result of this alternate method of searching is a complete fully indexed repository of content that allows users to search against all pages on the website while avoiding any "throwaway" hits that would be generated by the site's navigation. If you decide to go this route in applying search to your public-facing dot com portal and have any questions, don't hesitate to ask. Additionally, if you would like the SQL query used to pull links to all the community pages within the portal, drop me an email and I can get that to you.

Happy "Smart" Searching.

Monday, July 7, 2008

The PTOBJECTCOUNTERS Table: A Lifesaver in ALUI

Throughout the life cycle of an AquaLogic User Interaction portal installation, a variety of unexpected and unwelcome events can occur. One of the most frustrating of these issues can arise through the deletion of an administrative object that has unforeseen dependencies. Unfortunately, when an admin object in ALUI is removed, there is no way to re-create it in the portal and have it maintain its original object ID.

An example of a deleted admin object causing other issues in the portal occurred at my client site last fall. An administrator had deleted an old Publisher portlet that was in the process of being replaced by another portlet. A few days after this portlet was deleted, users started noticing that certain links in other Publisher portlets would result in a gateway error. After some detailed troubleshooting, it was determined that multiple Publisher links throughout the portal were gatewayed through the removed portlet. Since the portlet with the referenced ID no longer existed, the links failed and the gateway error was returned. In short, we needed to restore the deleted portlet to allow the gatewayed links to function properly.

An attempt was made to export the object from the test environment portal and import it through a .pte file. This method did not work: the old portlet was imported, but with a different object ID. The only "supported" way of restoring the old object with the proper ID is to restore the entire portal database. This is really not an option for portals with a highly active user base, where content changes hour by hour. A full restore of a database from two days earlier could have several unwanted repercussions, including:
  • Lost content that would have to be re-crawled or re-imported

  • Lost portlet and community preference changes made in the past two days

  • Non-synched portal users created in the last 48 hours that would need to be re-created

  • A whole slew of other issues that make this option one to avoid if at all possible

In this emergency situation I decided to utilize the rarely used PTOBJECTCOUNTERS table in the portal database. This table simply functions as a counter mechanism for objects of different classes. When a new administrative object is added to the portal, the NEXTOBJECTID field of the PTOBJECTCOUNTERS table is referenced to determine the next object ID to use for the specified object class. This table exists to prevent object ID conflicts throughout the portal.

To fix the Publisher issue in my client's portal, I used SQL Enterprise Manager to change the NEXTOBJECTID value of the record with CLASSID 43 (the class ID for portlets) from its existing value (which I copied so I could restore it after the fix) to the ID number of the removed portlet. I was able to determine the ID of the deleted portlet by looking at the gateway URL referenced in the malfunctioning Publisher links. Once the ID number was changed, I re-imported the .pte file from the test environment. The portlet object was created with the proper ID, and immediately all broken Publisher links in the portal were working without issue. I then changed the NEXTOBJECTID value back to the original value to ensure that all new portlets would be assigned a unique object ID.
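The counter swap can be sketched against a stand-in database. The sqlite snippet below only simulates the idea on a throwaway copy of the table; the real PTOBJECTCOUNTERS lives in your portal's SQL Server or Oracle database, the sample IDs are made up, and you should back up before touching it, so treat the statements as a pattern rather than a ready-made script.

```python
import sqlite3

# Throwaway in-memory stand-in for the portal database -- purely illustrative.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE PTOBJECTCOUNTERS (CLASSID INTEGER, NEXTOBJECTID INTEGER)")
db.execute("INSERT INTO PTOBJECTCOUNTERS VALUES (43, 512)")  # 43 = portlet class

PORTLET_CLASS = 43
DELETED_PORTLET_ID = 227  # hypothetical ID read from the broken gateway URL

# 1. Save the current counter value so it can be restored after the fix.
(original,) = db.execute(
    "SELECT NEXTOBJECTID FROM PTOBJECTCOUNTERS WHERE CLASSID = ?",
    (PORTLET_CLASS,)).fetchone()

# 2. Point the counter at the deleted portlet's ID, then run the .pte import
#    (outside this script) so the re-imported portlet receives that ID.
db.execute("UPDATE PTOBJECTCOUNTERS SET NEXTOBJECTID = ? WHERE CLASSID = ?",
           (DELETED_PORTLET_ID, PORTLET_CLASS))

# 3. Restore the original counter so future portlets get unique IDs again.
db.execute("UPDATE PTOBJECTCOUNTERS SET NEXTOBJECTID = ? WHERE CLASSID = ?",
           (original, PORTLET_CLASS))
```

The save-and-restore in steps 1 and 3 is the part that keeps the workaround safe: leaving the counter pointed at an old ID would hand out duplicate object IDs to the next portlets created.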


This is just one example of how the PTOBJECTCOUNTERS table can allow an administrator to correct object ID issues in the portal. PTOBJECTCOUNTERS is simple in design but integral to the makeup of the ALUI portal platform, and it can be a lifesaver at times.