Author: mikaelsand

How to only poll data on specific weekdays using the WCF-SQL adapter

There are a lot of solutions to this particular problem. The requirement is that we only poll data from a database on Sundays.

This might be solved using a stored procedure that only returns data on Sundays. It might also be solved by using the well-known Scheduled Task adapter to schedule the poll for Sundays. You could even write a custom pipeline component that rejects data on every day but Sunday. Your scenario might be very well suited for one of these solutions, but the scenario presented by my colleague Henrik Wallenberg did not fit any of them.

The scenario

A database is continuously updated throughout the week, but we need to export data from a specific table every Sunday at 6 pm. We cannot use the Scheduled Task adapter nor stored procedures. We decided to try to trick BizTalk using the PolledDataAvailableStatement in the WCF-SQL adapter on a receive port. It turns out it works! Here is how.

Please note that this does not work if you cannot use ambient transactions.

According to this post, you must set Use Ambient Transaction = true if you need to use a PolledDataAvailableStatement. This seems really odd to me, but after receiving feedback about this article I know that it is true.

The solution

  1. Create the receive location and polling statement.
  2. Find the setting PolledDataAvailableStatement.
  3. Set it to: SELECT CASE WHEN DATEPART(DW, GETDATE()) = 1 THEN 1 ELSE 0 END
  4. Set the polling interval to 3600 seconds (once an hour).
  5. Apply your settings.
  6. Set the Service Window to only enable the receive location between 6 pm and 6:30 pm.
  7. Now the receive location will only poll once a day and only execute the polling statement on Sundays.

More information

How does this work? It is very simple, really. The property PolledDataAvailableStatement (more info here) needs to return a result set (i.e. a SELECT). The top-left, first if you will, cell of this result set must be a number. If a positive number is returned, the polling statement will be executed; otherwise it will not.
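
As an illustration, a more conventional data-available check simply counts the rows that are waiting to be picked up. The table and column names below are made up for this example:

-- Hypothetical example of a typical PolledDataAvailableStatement.
-- The first cell of the first row is a number; any value greater than zero
-- tells the adapter to go ahead and execute the polling statement.
SELECT COUNT(*) FROM dbo.ExportTable WHERE Exported = 0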

The SQL statement uses a SQL built-in function called DATEPART with a parameter value of “dw”, which returns “Day Of Week”. More information here.
Day 1 is, by default in SQL Server, a Sunday, because Americans treat days and dates in a very awkward way. Depending on your server's settings, you might need to tweak the statement if Sunday is the 7th day of the week in your environment. So the statement SELECT CASE WHEN DATEPART(DW, GETDATE()) = 1 THEN 1 ELSE 0 END returns a 1 if it is day 1 (Sunday). This means that the polling statement will only be executed on Sundays.
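
If your server does not use the US default week numbering, here is a sketch of a variant that does not depend on the DATEFIRST setting. It does assume the server language is English, so that the day name comes back as 'Sunday':

-- Sketch of a DATEFIRST-independent check: DATEPART(DW, ...) shifts with
-- SET DATEFIRST, but the weekday name does not (it follows SET LANGUAGE instead).
SELECT CASE WHEN DATENAME(WEEKDAY, GETDATE()) = 'Sunday' THEN 1 ELSE 0 END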

We then set the polling interval to 3600 seconds so the statement only executes once an hour. This, together with the service window, makes sure the statement only executes once a day (at 6 pm), as the receive location is no longer enabled when the next interval comes around (7 pm). You could update the SQL statement to take the hour of the day into consideration as well, but I think it is better to not even execute the statement.
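
If you do want to encode the time of day in the statement itself, instead of (or in addition to) the service window, a variant along these lines would do it:

-- Sketch: only report data as available on Sundays between 18:00 and 18:59.
SELECT CASE
         WHEN DATEPART(DW, GETDATE()) = 1 AND DATEPART(HOUR, GETDATE()) = 18
         THEN 1
         ELSE 0
       END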

The downside

This is not a very reliable solution though. What if the database is unavailable during that one window in the week when data is transported? Then you either have to wait until next week, or manually update the PolledDataAvailableStatement to return a 1, make sure the data is transported, and then reset the PolledDataAvailableStatement again.
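
The manual override is at least trivial: while you need to force a poll, the PolledDataAvailableStatement can temporarily be set to anything that returns a positive number, for example:

-- Temporary override: always report data as available so the next poll runs.
-- Remember to put the original statement back afterwards.
SELECT 1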

In conclusion

This solution is only viable in a very particular scenario, and even then it needs to be checked every week. Perhaps you should consider another solution. Thanks to Henrik for making my idea a reality and testing it out. If you want to test it for yourself, some resources to help you can be found here: InstallApp Script

Flatfiles and repeating tag identifiers

This particular post might not be very “edgy” but rather something to get me going, as well as functioning as a personal “how the h-ll did I do that” for the future.

The problem

A client presented us with a rather standard flatfile. It was about 20 rows long, the rows were separated by the ever-usual CR LF, and each row contained a tag identifier.

The identifiers were numbered 10, 20, 30 and 40; they could appear in any order, they might or might not be present, and the records were positional.

Seems easy, right? Well, it took me some time, and there are a couple of gotchas along the way, so I thought I would share them.

The file

Here is a subsection of the file. Notice the repeating identifiers and the difference in length between rows with identical tags.

10000000000004083006019800324E50000John Doe
2000000000000000433       Q00086Jane doe
3000000000000008448H00001Joe Doe
4000000000000008448H00001Hanna Montana
10000000000004083006019800324E50000

The solution

There were three issues that needed to be solved and, again, I am not presenting anything new here.

  1. Tag identifiers in positional records.
  2. Repeating tag identifiers.
  3. The last column could be empty or not padded.

I started by adding a flat file schema, adding a root, and setting the Child Delimiter Type to Hexadecimal and the Child Delimiter to 0xD 0xA. Now for the interesting part.

Tag identifiers in positional records

Tag identifiers are, simply put, just that: some kind of text at the start of a record that shows what kind of record it is. It is a very useful feature and is often used in mainframe scenarios.

The tag identifiers in the file above are the first two positions. So I added a new child record and set the property Tag identifier to 10, and the Structure to Positional.

I then proceeded to add all the different columns based on their relative lengths and positions.

Moving on I did the same for the other tag-types as well and ended up with a schema looking quite simple and straightforward.

This is when I ran into my first error: the tag is not removed from the row of data when it is used. So all the other positions were offset by two, and the parser could not find any columns; or rather, the rows were too long.

In order to make it work you have to offset the first field by the length of your tag identifier, or add a column at the front to handle the identifier. I opted for the first one and updated the property “Positional Offset” to “2” for the first field in every record.

Repeating identifiers

You may or may not know this, but XML actually infers some kind of tag order when writing data defined from an XSD. That is why you get an error when you skip a tag that does not have Min Occurs: 0.

So how do you handle the fact that some records might repeat later in a file, like the one in question? The answer is to put all the root children in a Choice node. So I right-clicked the root node and selected to add a Choice Group. I set the property Max Occurs to Unbounded for the choice node, as well as for all the child records. For those I also set Min Occurs: 0.

Lastly I added all the child nodes to the Choice node, and now the row types (tags) may appear in any order and may occur any number of times, including zero.

The last column may be empty

One very popular way to export data from a mainframe is to make all the rows the same length. A very popular length is 80 characters. However, many exporters choose to end the rows when the data ends. So rather than padding with enough white space to reach a total length of 80, a CR LF is inserted and the row is ended.

You might think that simply setting the tag to “Min Occurs: 0” would solve this, but you, like I was, would be wrong. The concept is called “Early Termination” and once you know that, it is easy.

Early termination is disabled by default, so I had to enable it. The property is located at the schema level and is called Allow Early Termination. Once I set it to True, everything worked fine.

Note that if you do not set Min Occurs: 0, the tag will be created but set to null if the row terminates early.

Data: "...Mikael" = <NAME>Mikael</NAME>
 
Data: "..." = <NAME />

In conclusion

This fairly standard way of exporting data from a mainframe to a text file should, from now on, not really be a problem.