Sunday, January 29, 2012

Managing files, and Google Chrome Bookmarks

Last year I moved over to using Google Chrome. Various mathematics and graphics sites I was directed to on Twitter required IE9, Chrome, or Firefox. Since IE9 requires an update of the operating system, that left the choice between Chrome and Firefox. Since I use various Google tools and they seemed faster in Chrome, I opted for Chrome. I was also experimenting with exporting my PersonalBrain file for a web site, and it too only displayed correctly in Chrome and Firefox: all the link lines between thoughts are missing in IE8. Firefox may have been the better option, since I encountered a problem with Chrome when visiting ExcelCalcs, with site menus disappearing behind video images displayed on the opening page: but this appears to have been fixed.

Files and Folders
In any case, I don't like either the Chrome or Firefox bookmark files. Shortcut files waste hard disk space: a small file uses up an addressable block larger than the file needs. However, shortcut files can be arranged anywhere on the hard disk, to create task-specific folders or be associated with project folders, totally independent of the browser. When I do internet research I tend to accumulate hundreds of files and shortcuts. At the time of acquiring them I may or may not sort the files and shortcuts into some rational collection of topics. So, barring the time-consuming task of sorting into topics, I have VBScripts (.vbs) which are used to sort files into folders based on year, month and file extension. Some scripts create folders locally, others move files from one folder structure into another. Once sorted into similar chronological folder structures it then becomes easy to use Beyond Compare to eliminate duplicates. I've tried duplicate file finders: they just duplicate the mess, take far too long, and then require a considerable amount of time to manually decide which are the originals and which are the duplicates. Moreover, I don't actually want to delete duplicates; I do want a chronological archive, showing the historical development of the various things I work on. So custom scripts became the way to go, though the scripts at the moment just create an alternative mess: there are some situations where I need to keep different file types together. To this end I have been experimenting with FlexTek/DiskBoss and XYplorer: with these applications I can search, exclude folders, and otherwise move files and auto-increment filenames if duplicates are involved.
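The sorting idea is simple enough to sketch. My scripts are VBScript, but the same thing in Python is easier to show here; the folder paths are hypothetical, and files are binned by their last-modified date into year/month/extension folders, with names auto-incremented on collision:

```python
import shutil
from datetime import datetime
from pathlib import Path

def sort_by_date(source: Path, archive: Path) -> None:
    """Move each file in `source` into archive/YYYY/MM/ext/."""
    for f in source.iterdir():
        if not f.is_file():
            continue
        stamp = datetime.fromtimestamp(f.stat().st_mtime)
        ext = f.suffix.lstrip(".").lower() or "none"
        target = archive / f"{stamp:%Y}" / f"{stamp:%m}" / ext
        target.mkdir(parents=True, exist_ok=True)
        dest = target / f.name
        # auto-increment the name if a duplicate already exists
        n = 1
        while dest.exists():
            dest = target / f"{f.stem}({n}){f.suffix}"
            n += 1
        shutil.move(str(f), str(dest))
```

Once everything sits in the same chronological structure, a folder-compare tool has matching trees to work with.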

Archiving internet shortcuts in MS Access
So I had just got into regularly sorting files and shortcuts into year and month when I moved over to Chrome: no shortcuts, no direct means of programmatically sorting my bookmarks: a step backwards. A big step backwards, because I also have Excel/VBA and MS Access/VBA macros to read internet shortcuts, extract the information and then archive it in an MS Access table. The table contains keyword fields, and other fields for grouping. Using queries in MS Access I can delete duplicate shortcuts, and otherwise search and find stuff that I need. Admittedly not altogether helpful, since websites disappear or get modified, and the weblinks become invalid. However, even Google returns invalid links, so it saves some time to search my database, get the name of the webpage, and then search Google.
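The extraction step relies on Windows .url internet shortcuts being plain INI text with an [InternetShortcut] section, so the weblink can be pulled out with a standard INI parser. A minimal sketch (in Python rather than the VBA I actually use; the file path is hypothetical):

```python
from configparser import ConfigParser
from pathlib import Path

def read_shortcut(path: Path) -> str:
    """Return the URL stored in a Windows .url internet shortcut."""
    cp = ConfigParser()
    cp.read_string(path.read_text(encoding="utf-8", errors="replace"))
    # option names are case-insensitive, so "URL=" is found as "url"
    return cp["InternetShortcut"]["url"]
```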

That I archive all my internet shortcuts in MS Access tables, and delete the shortcuts from my hard disk, did suggest that the Firefox bookmark manager was a possible alternative. But after a few days of messing with the Firefox bookmark manager I concluded it was too slow and unstable. I also experimented with PersonalBrain, and imported all my bookmarks: but I have no means of programmatically sorting, tagging and linking them. I also cannot get used to using PersonalBrain as a starting point and launcher. Consequently MS Access remains the centre of everything: I can build tables cataloguing files, archive internet shortcuts, and otherwise link these together with project data, and display the data in whatever format I want using VBA.

So either way, bookmarks have to be extracted from the Chrome or Firefox bookmark files. At first I didn't think this was a problem: several years back I had written a simple Excel/vba macro to rip url references from html files. The purpose was to extract weblinks from saved webpages containing little more than hyperlinks, put the hyperlinks into an Access table, then delete the saved webpage and associated files. At one stage I saved webpages, but discovered I couldn't back them up to a CD, since CDs didn't support the long file names, nor my deep folder structures. So I got rid of all the saved web pages, after extracting the useful urls and otherwise printing to pdf. Now instead of saving webpages, I just print web research to pdfFactory.

Messing with the Google Chrome Bookmark file
So my existing vba routine was seemingly of little use. I tried opening the bookmark file in Excel directly: it turned out not to be a properly formatted html file. I also tried opening it in XML Notepad and UltraEdit, but it turned out not to be a properly formatted xml file either. So I started writing a finite state machine parser [1], similar to the techniques for parsing AutoCAD DXF files. That was early last year, and I otherwise got sidetracked from it. But it's a new year, and time to clean up last year's accumulation of junk. I'm a womble: I tend not to throw anything away: near guarantee that within a few days of doing so, I will end up needing it.

Early last year I did do a Google search, and found code for parsing html and xml, but as indicated the bookmark file is not fully compliant with either. I also found various projects for parsing the bookmark file, but all were based on programming languages which support searching using regular expressions, and otherwise exported the data to xml format. So I did another search this year and discovered that the file format is json, for which there are a variety of parsers in a multitude of programming languages: once again mostly using regular expressions and converting to xml.

Downloading code for a json parser, and then customising it specifically for extracting the data from the bookmark file, all seemed too complicated. I originally considered writing a parser because there is no guarantee on the location of data in the file, therefore better to find it by properly parsing the structured file format. However, it appears that each bookmark is written to a single line in the file, therefore I only need to chop up an individual string: I don't have to scan across multiple lines and find opening and closing tags. Therefore last night I discarded the finite state parser idea, and went back to the original brute force method. Read a line; if the string contains a url, then chop it up, extract the various pieces of information, and store them in an Excel worksheet. From Excel I can then transfer to MS Access. I typically program in Excel/vba since that is where all my additional functions and macros are for string handling and file handling: also programming in MS Access is a bit more cumbersome, as is working with tables {especially creating and emptying tables}.
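For comparison, in a language with a json library the structured approach is not much work either. A sketch, assuming the layout Chrome uses for its on-disk bookmarks file at the time of writing: a "roots" object whose folder nodes carry "children" lists, with bookmark nodes of type "url":

```python
import json

def extract_bookmarks(text: str):
    """Yield (name, url, date_added) for every bookmark in the file."""
    def walk(node):
        if node.get("type") == "url":
            yield node.get("name"), node.get("url"), node.get("date_added")
        for child in node.get("children", []):
            yield from walk(child)
    data = json.loads(text)
    for root in data.get("roots", {}).values():
        # skip any non-folder bookkeeping entries under "roots"
        if isinstance(root, dict):
            yield from walk(root)
```

Unlike the brute force line scan, this doesn't care whether a bookmark happens to span one line or several.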

Being a brute force method, the macro is likely to come unstuck at some point in the future, but at this point in time it meets my needs. Well, with one exception: the Google Chrome bookmarks file contains ADD_DATE and LAST_MODIFIED fields, but does not show this information in the bookmark manager. The LAST_MODIFIED relates to folders, and I am currently ignoring the folders I have in Google, but ADD_DATE relates to the bookmark: it is an integer type date value, but is not the same as date values in Excel/vba. So the current task is to find out how to convert the value into an Excel/vba compatible date value.
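A sketch of the conversion, under the two common interpretations of the stored integer (both are assumptions to be checked against the actual file): ADD_DATE in an exported bookmarks html file is normally a Unix timestamp, seconds since 1970-01-01, whereas Chrome's internal json file stores microseconds since 1601-01-01. Excel/vba serial dates count days, with 1970-01-01 falling on serial 25569:

```python
from datetime import datetime, timedelta, timezone

UNIX_TO_EXCEL_DAYS = 25569  # Excel serial date of 1970-01-01
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def unix_to_excel_serial(seconds: int) -> float:
    """Unix timestamp (seconds since 1970) -> Excel/vba serial date."""
    return seconds / 86400 + UNIX_TO_EXCEL_DAYS

def webkit_to_datetime(microseconds: int) -> datetime:
    """Chrome/WebKit timestamp (microseconds since 1601) -> datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)
```

The first formula translates directly to a vba one-liner: the serial value can be handed to CDate as-is.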

Google Chrome API
I did find the Google Chrome API, and it appears I can access the bookmarks object directly, and otherwise build my own bookmark manager to work in Chrome. However, being built around a browser, I am unsure whether it is permitted to write the data to a local file. JScript run locally, through the Windows Script Host, can read and write files on the local hard disk: this is thus something to look into at a future date.

On High Level Programming Languages and Application Languages
The first programming language I learnt was Fortran 77 at uni. At school we had filled various cards in and sent them off, the languages used being BASIC and APL: basically these were random one-lesson exercises, and of little real value. Well, they had one benefit I guess: if you want to solve some matrix problems, then using a computer is going to take a week or more to get results back. I can actually solve the problems in less time than it takes to fill in the stupid cards. At uni we moved away from cards, but otherwise had to fill in programming sheets and hand them in to data processing. Once typed up, I could then edit and modify the program as necessary to get it running. For which purpose, the VMS operating system was described in a completely mysterious and unhelpful manner, and then there was fighting to get access to a computer. Not much fun, and little learning.

At home we went and got a Microbee computer-in-a-book, running CP/M-80 with a single 3.5" floppy drive, which we eventually extended to 2 drives: one for data, and one for applications. I messed with the BASIC programming language included for a short time, and then we got Turbo Pascal 3 {unfortunately on 5 1/4" floppy}. When we moved over to MS DOS, I eventually moved up to Turbo Pascal 5, but also started programming with Turbo C. I also messed with the Lotus 123 macro language (and As-Easy-As), the dBASE/FoxBASE application language, WordStar indexing and formatting codes, the Paradox application language, AutoLISP, and submit and batch file programming.

I like Object Pascal, but prefer the syntax of C, and especially prefer C for processing strings and crunching numbers. But I get lost in C when it comes to pointers, and find its data structures unreadable and the syntax extremely cumbersome. When I moved over to Windows I got Delphi and C++ Builder, along with Quattro Pro for Windows and Paradox for Windows. When it came to objects I preferred Delphi over C++. However, I ended up building more applications in QPro than Delphi. I reached a stage where I wanted to extend the functions in QPro rather than having to ensure macros would run. I accidentally found a book on Office programming whilst wandering around a book store at lunch time: bought it, learnt about the vba language, then went and bought the Office 97 developer's kit. I then started writing more Excel/vba code. At one stage the various Australian standards changed, and that meant that a program I was developing in Delphi became obsolete, along with my Excel function libraries. Since I use Excel as my primary calculating tool, it was more important to update my Excel/vba function library than the Delphi library. The result was I then translated the Delphi program to vba. Another reason for doing so was that I had attempted to import the Excel type library into Delphi, so that I could use COM automation from Delphi, but the type library had too many conflicting keywords. I don't need the type library, but Pascal uses different brackets for functions and arrays whilst vba doesn't. So it is problematic programming Excel from Delphi, since I do not know whether an object is referencing an array or a function.

So I ended up spending as much time in the Excel vba editor (vbe) as I do using the Excel worksheets. I have Visual Studio 2003 and 2005; the idea was to program in VB, to save time in converting vba to Delphi for stand-alone programs. But VB is a different language again, and so the vba code still requires translating and debugging.

So this has tended to put me in a quandary. Most Windows Script Host (WSH) scripts I write using VBScript, though occasionally I use JScript. For website development I originally used JavaScript, but moved over to VBScript. The prime reason being the existing libraries I already have in vba.

However, the other year I got a netbook, for which I basically have no software, other than any freeware and open source stuff that I can find. It came installed with a demo of Office 2007; I didn't like the new interface, so didn't buy it, and uninstalled it. I have OpenOffice installed instead. The problem with OpenOffice is that it uses a different dialect of BASIC again, and also doesn't appear to directly support COM automation. I use Excel/vba to control other applications through COM automation: IntelliCAD, DesignCAD, TurboCAD, Multiframe, MS Access. {By the way, I discovered DesignCAD because its COM object is still accessible in Instant Shed Designer, which I was taking a look at, to see if it was of any use. Which raises a question: how can they sell something for less than the application it is built on? Milo Minderbinder.}

So which direction to move? The software companies seem to think turning a perfectly good screwdriver into a hammer is an upgrade. I recently built a spreadsheet for a client using Excel 2003; they saved it in 2007/2010, I'm not sure which, but I cannot open it with the file converters. Given I have around 200, possibly 1000, unique spreadsheets, I don't want to mess around converting file formats: been there, done that with QPro to Excel: I still have QPro installed to check the original files occasionally. {Excel 97 could convert QPro files; Excel 2003 cannot: not to mention Windows classifies the files as a hazard.}

I don't see how changing the interface can be considered as being the same product: it becomes a completely different tool. In Lotus 123, As-Easy-As and QPro for DOS, my fingers flew across the keyboard. Then QPro for Windows introduced the property inspector, and the mouse became important. Little by little I am losing keyboard skills and becoming dependent on a mouse. I keep trying to get them back, but using the mouse is habit forming, and difficult to break when trying to get a job done.

In any case, I spend a lot of time programming, because I expect the computer to do the work, not me. The first backward step for Windows was removing the batch processing capabilities. Now there is the Windows Script Host (WSH), and Windows can be programmed using batch (.bat), VBScript (.vbs), and JScript (.js). There are macro recorders around, but they are all generally context sensitive and screen based: not command based.

I don't really like Windows, but most software I use is Windows dependent. I always wanted a Unix machine, but wow, System V required a massive 20 Mbytes for a full install: the largest hard disk on a PC was only that big, so no space for data. Now there are people struggling to create small Linux installs of 40 to 100 Mbytes, though most of these include a graphical windowing interface. The original core of the internet is built on Unix, and Unix is built around the C programming language (its scripting language is C like), whilst most web browsers use JavaScript, which is a C like language. I'm not sure, but I think OpenOffice supports JavaScript.

So I am contemplating moving over to using Ubuntu, Java and JavaScript. As it is I have Ubuntu installed on a stick for fixing Windows when it crashes. The problem with Ubuntu is compatibility with clients, but how compatible do I need to be? Most clients are barely able to use a computer, but they do have Windows, and make use of Excel. But recent projects have identified that Excel based applications are not really appropriate for their needs. I mean, 20 Mbyte Excel files are not really efficient, when I could write a Turbo Pascal program of around 250 Kbytes or so. There are plenty of people who can build Excel applications, and mostly they can be built on top of current manual processes, thus retaining some transparency. But such applications can become cumbersome, demand more computer resources than necessary, and are not altogether user friendly. Not the least of which is that many of them would be better built around MS Access, which is far better for storing and sorting data. You don't necessarily need MS Access: you can connect to mdb files using DAO or ADO objects from within Excel/vba or another vba programming environment. That I guess is the deciding factor: COM automation and/or the .NET framework.

There have been rumours that vba will be removed from Office; it has already been removed from AutoCAD. That is not a problem: I already program DesignCAD and Multiframe external to the applications, using Excel/vba, because there is no vba editor and no programming environment built in. The applications are just COM automation servers. Some people are not happy AutoCAD lost the vbe, but AutoCAD was always programmed in AutoLISP using an external text editor, so it is not really a major issue. As such I avoid AutoLISP and use other programming languages to generate script files (.scr) or otherwise parse DXF files. If I lose Excel/vba, then vba is no longer important, and I can move over to another language, like C# for example. It's a matter of finding the right language to deal with objects, and avoiding problems with pointer arithmetic. I should also mention that I do not like the vba approach to classes, and its creation and referencing of objects can be more of a hassle than working with pointers. In a nutshell, I swap programming languages to suit my needs at a given time.

The primary programming task is bashing files: translating one data format to another, simply extracting information, or automating file generation or program execution. Most number crunching I do directly in the Excel worksheet. At one time I used QEdit, and then UltraEdit, for manipulating files, but now more often than not I use string formulae in an Excel worksheet, either to generate data input files or otherwise modify them. Even if I ultimately write a vba macro, I typically test searching and cutting up strings using Excel cell functions, then rewrite using the equivalent vba functions.

However, I do use vba functions a lot, and also use vba to generate tables of data. I don't like complex cell formulae; they are difficult to read, and I don't agree with MathCAD about textbook-like formulae, and MathCAD diverges away from mathematical notation in any case. Spreadsheets have the advantage that it is not necessary to name variables: you are already working within an indexed matrix. So no problem differentiating the sigma used for stress from the sigma used for standard deviation: a problem if doing statistics on stresses. Australian standards are dependent on many test conditions to select one formula over another: I find such conditional checks read better when written in vba rather than in a cell formula. Secondly, whilst Excel can generate tables of data, it is not very efficient, for the formula is repeated in every cell. By using a vba loop, it is more likely that every cell is calculated based on the same formula and conditional checks. The problem with that is that changing a parameter doesn't automatically update the calculations. Circular references are also a problem in Excel. Whilst circular references can be used to deliberately force iteration and find a solution to a calculation sequence, Excel flags circular references as an error: I therefore don't like that approach. Further, it starts getting complicated if I want to generate a table of values, or want automatic calculation based on other values. Like recently, for balustrades, I wanted to calculate the maximum span of the top rail based on a range of conditions. For one set of conditions I could simply rearrange the equations and calculate directly; for another I ended up with a quadratic, which I could solve in the worksheet cells and get a direct answer; but for other conditions there was no simple solution. So I ended up repeating all that was calculated in the Excel worksheet in vba, so that I could iteratively find a solution and return a simple value to the main body of the Excel formatted report.
Part of the problem was that the limiting deflection for the span is dependent on the span: larger spans permit larger deflections. In the first instance, with a simple expression, I could simply substitute the span ratio for the span into the expression for deflection, rearrange, and get the answer directly. However, I had simplified the expression for the deflection limit; when I replaced it with the actual expression, it got complicated: hence I wrote an iterative function. Before doing this, however, I simply used Excel's goal seek capability to find the answers: that however did not fit in with what I wanted to calculate and where I wanted it to fit in with the rest of the calculations.
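The shape of such an iterative function is easy to sketch. The formulae below are illustrative stand-ins, not the actual balustrade calculations: midspan deflection of a simply supported beam under uniform load, with a permitted deflection that itself depends on the span, and bisection to find the largest compliant span:

```python
def max_span(w, E, I, limit=lambda L: L / 180.0):
    """Bisect for the largest span L where deflection <= limit(L)."""
    def excess(L):
        # illustrative: 5wL^4/384EI for a simply supported uniform load
        deflection = 5 * w * L**4 / (384 * E * I)
        return deflection - limit(L)
    lo, hi = 0.1, 1.0
    while excess(hi) < 0:       # grow the bracket until it straddles the limit
        hi *= 2
    for _ in range(60):         # bisection halves the bracket each pass
        mid = (lo + hi) / 2
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return lo
```

With this simple limit the span could of course be rearranged for directly; the point of the iterative form is that it still works when the limit expression gets complicated.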

In-House Software Development
Such situations are why I adopted a policy of writing our own engineering software, rather than buying commercial. I'm not going to write my own version of Multiframe or AutoCAD: not yet anyway. I say not yet, because we have an in-house 2D frame analysis package, written in Turbo Pascal, currently being transferred over to Excel/vba. It is also apparent, in writing my own programs to generate drawings, that I have all the data structures necessary to write a CAD package: I just don't have the graphics, printing or an editor. In translating the 2D frame analysis to Excel/vba, I have used the low cost CAD package DesignCAD 3D to draw the frame and moment diagrams. But it is becoming apparent that I could use the shapes layer in Excel to draw. This would be good. I think I mentioned before that Multiframe and AutoCAD are far too expensive to use as analysis and graphics engines for point-of-sale design software. Given that most suppliers with software have it built around clumsy spreadsheets, it seems there is opportunity to build a more appropriate application around the full capabilities of Excel/vba. But would it be better to build around OpenOffice, noting that OpenOffice is available for both Windows and Linux? It would be beneficial if OpenOffice doesn't keep getting automatically updated, so that applications built around it remain static: I don't want to be constantly fixing bugs caused by office updates. This is what pushes in favour of higher level programming languages and building stand-alone applications. The problem with stand-alone applications is that a lot of work is required on the user interface: traditional practice places it at some 80% of the effort. By building on an existing application, development of the user interface is reduced: no file formats, no reading/writing file problems, no editor development, no screen display issues, and no printing issues. The developer of the primary application has resolved all those issues.
Additionally, even if I write a stand-alone program, it is likely dependent on a multitude of Windows resources provided by the API, so if Windows updates then the program may become unstable. So whilst Windows has resolved many of the hardware compatibility issues, it has largely replaced them with software problems.

The other issue is should such applications be internet aware and function over the internet, and, if they can function over the internet, should they also operate on mobile phones? This pushes towards using Java and JavaScript as the programming languages. On the other hand, the .NET framework is supposed to be about it not mattering which language you program in: source code in many different languages can be used in the one application. It is something of an alternative to the Java virtual machine. Which raises another issue: programs compiled using Visual Studio are not exactly stand-alone programs, and are still relatively large compared to other compilers, despite all the external resources they depend on. Picking the best programming language is not so easy, so don't choose: just use them all, as the needs determine.

As mentioned, I have a policy of writing in-house software; by doing so I keep up to date with changes in Australian standards, and don't have to wait for commercial software writers to update and release new versions. More important, however, is the freedom to format our own reports, to carry out calculations in the sequence desired, and to use the results for other purposes. Everything written becomes a building block towards working with larger and larger systems in less and less time.

On one hand engineering is about numbers rather than mathematics; on the other it is more about theory than calculations. For example, engineering is not about substituting numbers into an expression like M=wL^2/8; it is more about the information such an expression gives me about the physical world and the choices I can make. With commercial software you plug numbers in, then get the size of a beam returned. In general, software typically checks compliance with codes of practice; alternatively it may either return a single option or a list of several possible options in terms of selecting a suitable component part. As a result the software typically requires the use of trial and error for anything else. For example: pick a component, say a structural section, and check if it complies; if it doesn't, then pick another and check again. But how to select the first choice, and, it having been rejected, which direction to go for the next choice? It's the theory which gives the direction. With the theory it may be possible to calculate directly for the problem at hand, whilst the available commercial software may only be able to get there through trial and error.
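The point in miniature, using the expression above (the numbers are illustrative only): rather than trialling spans until the midspan moment M = wL^2/8 falls under a capacity, the expression can simply be rearranged for the span directly.

```python
from math import sqrt

def midspan_moment(w: float, L: float) -> float:
    """Midspan moment of a simply supported beam, uniform load w: wL^2/8."""
    return w * L**2 / 8

def max_span_for_moment(w: float, M_cap: float) -> float:
    """Rearranged directly: largest span with midspan moment <= M_cap."""
    return sqrt(8 * M_cap / w)
```

Trial and error gets to the same answer eventually; the theory gets there in one step.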

Furthermore, these decisions typically need to be made by persons who are just aiming to get on with building things; it becomes a major delay and frustration for these people when they have to wait on others to crunch some numbers and make design decisions {some days words look weird and foreign}. Manufacturing is moving towards making design more accessible to the population at large, so that manufacturers can make product better suited to the needs of the end-user. Manufacturing wants to get much like the construction industry and have zero inventory, making to order. However, 5 years of research, design and development (RD&D) to get a product to market doesn't fit well with this objective, nor does a need to build and test prototypes. The building industry can unleash bridges and massive buildings onto the world as real world experiments, placing many lives at risk based on mere compliance with codes of practice, which are themselves based on inexact and incomplete scientific research. Despite civil and structural engineers, the building and construction industry is based on some relatively inexact science: passed off all too easily as "they know what they are doing". For mainstream buildings this is not altogether an issue, there being plenty of historical repetition; for the novel and innovative, we are getting a lot closer to the limits of where the theories have been tested. The codes of practice are somewhat lacking in appropriate constraints, and in demanding prototype testing.

Commercial Software for Non-Engineers (Structural)
In any case, in the past people just made stuff for themselves and accepted the risks of doing so. In the modern world there is a system of exchange, and an inference of supply by specialists, and consequently extremely high demands on expected performance as opposed to specified performance. It's not good enough to simply state what a product is fit for; many disclaimers have to be given with respect to what it is not fit for. Then there is also the issue of how fit is fit enough, and is it adequate? Every time some person has an accident, the performance criteria are raised across the board. Whilst I think it is necessary to design better and safer products, I also think individuals should take greater responsibility for their choice and manner of use of such products, not always seeking compensation. Their irresponsible behaviour places inconveniences on everybody else: e.g. smoke alarms, RCDs, bicycle helmets, guards on exercise bikes, to name a few. Whilst these things provide protection in the event of a problem, the problem is not rampant, and these safety devices somewhat magnify the real cause of the problem. For example, why was a baby crawling around all over the floor sticking its fingers in everything whilst the mother was riding an exercise bike: and how did the child get so close to the exercise bike as to have fingers amputated? You don't need an exercise bike: just chase the kid around and keep it out of danger, teaching it about the dangers. But then the parents would have to know the dangers. Similarly, houses don't just burn down; there has to be a cause, a source of fire: often someone smoking in bed, or electrical devices left on. Not everyone cuts through power cables with a power saw. As for falling off bicycles, the most common injury is a broken collar bone: people fall sideways; hitting the head against the ground is not all that common. Sure, it is recommended to get these things and use them. The objection is to making these things mandatory, and forcing unnecessary upgrades in short time frames or at inappropriate times.

So allowing end-users to make parametric changes to a product via the use of software is more involved than simply crunching numbers. The software already in the market is appalling: the number crunching may be valid, but the reports output are near useless, the software has inadequate built-in constraints and error checking, and it may not be relevant whatsoever to the product desired.

The main culprit so far is the software used by the nail plated timber truss companies. The software is proprietary, and only agents supplying trusses have access to it. The reports have traditionally provided inadequate information regarding input parameters and decisions made regarding the selections specified: for example, no evidence that it is feasible to form a given nail plated connection. South Australia now has a minister's specification identifying requirements for such structural software not otherwise used by structural engineers. Part of the requirement is training in the use of the software, and the appointment of a person responsible.

This should also apply to retained standard calculations, for much of my work results from variations from standard calculations. Some days it seems that manufacturers of structural products have all forms of structure available, and simply pick up whichever calculations are at hand and submit them to council for approval. If I am to believe the manufacturers, then 10% of the time there are issues. From their rejected applications, and the standard calc's I may get to see, I hazard that the 10% only represents those issues caught by the council, and that much is getting approved when it should be raising questions. There is also a great deal that gets built without approval.

So we have software in the market in need of improvement, standard calculations to be replaced by software, and owner-builder stuff to be checked using DIY software. But the software really needs to be thought about and properly designed, not just launched into and written, as currently appears to be the case. This launching into happens because it is mostly considered to be simply Excel spreadsheets. Though I am also aware of some extremely expensive stand-alone software which was not up to its specified task: operating at point-of-sale. It is not just a matter of structural engineering and computer programming knowledge: there are other issues to consider. In particular, there can be no required training for its use: the typical person in the street should be able to use the software without any special training. It should produce reports useable by the end-user and also reports suitable for council approval. The data files should also be suitable for use in other software of greater use to certifiers. A certifier may want to check bending moments and stresses, which the DIYer is not so interested in. Thus have alternative versions of the software. Which was the problem with the truss software: the only way to check was with general purpose frame analysis software, and it takes the certifying engineer considerably longer to build a model than it takes the timber estimator using the specialist software. There are two basic requirements: 1) specification-of-intent, and 2) evidence-of-suitability. Currently available software used by non-engineers only meets the specification requirements; the evidence-of-suitability is inadequate and reliant on an assumption, or questionable certification, that the software makes a valid assessment. The new minister's specification attempts to make the certification more reliable and resolve other issues.

Software is written to speed things up. We can just send code direct to CNC machine tools and produce a finished product. But is such an auto-produced part fit-for-function? Is there any evidence of its suitability for purpose?

But it's not just software used by non-engineers that should be of concern. Software used by engineers is also questionable in its fitness-for-function, and most such software contains many long disclaimers. The computer reports generated are very close to being a mass of scrap paper, which typically no one has ever looked at. So if the designer didn't use it, what use does the designer think the certifier will get from it? More thought needs to go into the whole design, assessment and approval process to make it more quality robust. It is likely that information technology and knowledge engineering will play a great part in this process.

{well that's my divergent waffle over for today: Sat 2012-Jan-28 23:04}

  1. Dennis N Jump, 1989, AutoCAD Programming, TAB Books


PS: I have now set up another blog over on WordPress. The current blog here will continue with the chaotic free writing. As I get time to more formally structure and rewrite a post into a more polished article, I will put the modified article over on WordPress. I still won't think it is finished, but hopefully it will have a more focused flow, and start new ideas in new sections, instead of as a divergence from the current flow.

PPS: I wasn't aware that there were actual limits on free writing, such as time limits or otherwise page limits or word limits. Such aspects of free writing are based on an assumption of difficulty finding something to write, rather than a mass of divergent ideas interfering with clear thought. I don't have a time limit; the constraint is interruption for meals, or simply that the number of divergent ideas gets way beyond that which I can write about, and I just stop writing and keep on thinking: possibly talking to myself. To that end it is only partially successful at clearing my mind.