Code translation is one area where I am attempting to employ code to write code. In my case, in addition to direct code generation, I need to translate code between languages to move backwards. At least, that’s what some might call it. EATS was originally developed in PHP4 (from concepts proven in C++, evolved from C, and originally proven in COBOL). Then I migrated the code base from PHP to Java, and expanded a small set of object-oriented PHP4 DOM functions into a larger set of true OO and Xerces XML DOM code. That is the code whose functionality is described on AF.net in the EATS Framework Design book.
Tools for this kind of migration include the following source-to-source alternatives:
- Java2php – An open source project by Brian Folts, last updated in 2013, and developed to migrate Java code to PHP as a means to run a web site on hosts which do not support Java. It includes a sample conversion of a “Hello World” Java servlet to PHP.
- JavaToPhp – An open source project by Devjino, last updated in 2012, and developed as “a Java source code ‘compiler’ with PHP source code output.”
- RuntimeConverter – A commercial “PHP to Java code conversion tool and a library ‘runtime’ that in theory can support all useful aspects of PHP in Java.”
In addition, there are solutions which use Java bytecode class files as input and output PHP script.
The path of EATS has been to follow the essence of an approach to the development and maintenance of sound information based architecture which is aligned with a philosophy related to the essence of sound architecture in all realms. That includes development in a direction, along a sustainable path, that includes provision for evolutionary growth; rather than revolutionary tear down and rebuild.
The most important part of all IT software is the logic framework it defines for the functional task being accomplished; not the language it is written in. Throwing out the logic in order to change the language, to some new and improved, or just different in some other form, is the primary cause of the functionality failing to perform to requirements. This is compounded when the original humans who specified and/or implemented the functional requirements of the original software are no longer available to explain the system which was automated by the original software, and the physics, social, and legislative system alignments which exist between the software system and the real world in order to be correct, efficient, and effective.
If you are writing a game, maybe it doesn’t matter. If you are writing serious code, it matters. A lot.
When you’ve built up 50 years of understanding, expressed in code which works, you translate language, but you model and repeat proven logic frameworks. Rewriting by hand, whether moving forward or backward in language and technical capability, is error prone and can be error generative. Rewriting by automated algorithm is also error prone. Done correctly, it can be capability generative and error reducing.
A language is a structured system of communication (https://www.wikiwand.com/en/Language). A computer language is a structured system of communication between humans and machines. Ultimately, all computer languages translate down to hardware languages, based on the hardware type on which the software is intended to operate. What matters is consistency in the intention of the communication, not the language in which the human authors express that intention.
The intention of EATS, in the framing of MDA, is to move the clear expression of intention to a higher level of symbolic expression of meaning, with consistent clarity of communication across a more diverse set of both humans and hardware devices. One benefit being sought is stability of expression of human intention, with expanded clarity of understanding of that intention across a wider set of affected humans, and a reduction in the need for new translations which affect meaning of accomplishment, as cyber product evolution generates constant dynamics in the lower levels of current and forthcoming cyber-physical operational processes. Simply put, we need to be able to specify cyber operations once, have them well understood by all affected people, and migrate them consistently across generations of as yet uninvented cyber technologies as we move forward into the future. Stable, well understood specifications, riding on a constantly changing technical base.
The logic alignment which matters is conceptual and contextual, not necessarily physical. This is a major blind spot for agile development techniques that emphasize quick results over deep understanding of issues, and literal translation or rewrite replacement over technical refactoring, layered architectures, and rewriting as maintenance. Building the first floor of a structure out of 2×4s with a team of carpenters may work for a ranch style house. It’s not sound engineering for a four story building. It yields tear down, rebuild; or fall down, rebuild; and you hope for the former. The invention of new languages which allow programmers to write code faster doesn’t enhance their ability to understand the functional problem the code is supposed to solve. Moving forward in technology is not always a forward move in function, or value.
The three code translation options defined above have overlaps and differences in approach, as well as overlaps and differences in availability and alignment with Architected Futures’ overall strategy and direction.
- RuntimeConverter is a commercial product, and as such it doesn’t fit with the open source objective for EATS. It is interesting with regard to the approach taken to achieve a solution which can be supported commercially. (Concern over commercial viability has more to do with industrial quality than with money making potential. Industrial quality is a core feature requirement for strength, fitness, durability, and sustainability over time.) It translates Java code to PHP code where the PHP code operates in a contained framework that includes a standardized library of prewritten functions to perform Java-like functions which do not translate syntactically to PHP. Coverage appears to be reasonably complete, but follows areas where commercial demand has been able to support funding development; and development seems to have stopped while Java language features have continued to evolve. Mapping of coverage to the EATS code base to be converted requires performing an actual conversion exercise, which requires purchase of the product, since the full codebase is unavailable for examination.
- Java2php is open source and seems to follow a similar approach, with a small function library, available as part of the project code, which supports basic Java servlet functionality.
- JavaToPhp is open source and attempts a more direct syntactic translation. It includes a separate open source project, JavaParser, to parse the Java source code into an Abstract Syntax Tree (AST) before attempting to emit the AST as PHP. The JavaParser code, which builds the AST, is a separate source package. The version included in the JavaToPhp GitHub source is out of date, but a newer version is available as an independently supported, separate, open source package. This arrangement allows the Java parsing portion of the solution to evolve separately from the code translation portion, although the two eventually still need to be functionally bound. One example of the benefit is documentation, including a PlantUML Component Diagram made available from the parser’s GitHub site.
RuntimeConverter is commercial software, and does not fit into the EATS architecture if a functionally compliant open source alternative exists. Relying on open source allows EATS to provide a minimal cost capability for general use. (Cost here means TCO, total cost of operation, which includes maintenance, both bug fixes and enhancements, over time.) It also provides an avenue for self-support if, and when, the owner/authors of dependent software facilities, like this one, abandon their work.
While RuntimeConverter does not fit our objectives, there is merit to the approach of using pre-implemented library code as part of the solution. There is also merit to the AST approach.
In the short term, whichever of Java2php, or JavaToPhp, is able to achieve a clean conversion of the EATS Java code will be used as a bootstrapping mechanism. In the longer term, a merge of the two would be the most desired solution.
In the immediate term, JavaToPhp is proving to be the easiest to work with, even with the old parser. Some significant reasons for this include:
- I am not trying to perform a general purpose translation. Instead, I am trying to perform an 80% translation which may cover 90% plus of the code I am concerned with. I do not need to directly execute the resulting PHP; but rather can complete the conversion by hand for modules which need the extra work.
- Translation difficulty is a function of the variety produced by programmers who are free to “do their own thing” within the bounds of what the language allows. Translation difficulty is reduced when variety is reduced. With coding standards in place, a translator that works for that standard style of code becomes easier to build, and the translation machine is allowed to cheat: it only has to work for me, and does not need to work for anyone else.
To understand the approach to methods, one needs to have some understanding of context, and process. Our method is closer to transliteration, within a context, than it is to translation between contexts.
- Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text.
- Transliteration is a type of conversion of a text from one script to another that involves swapping letters (thus trans- + liter-) in predictable ways, such as Greek ⟨α⟩ → ⟨a⟩, Cyrillic ⟨д⟩ → ⟨d⟩, Greek ⟨χ⟩ → the digraph ⟨ch⟩, Armenian ⟨ն⟩ → ⟨n⟩ or Latin ⟨æ⟩ → ⟨ae⟩.
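As a rough illustration of the “predictable swap” idea behind transliteration, the sketch below applies a fixed lookup table to the letter examples listed above. The class and method names are hypothetical and exist only for this example; they are not part of EATS or of any of the translators discussed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only: table-driven transliteration. */
final class Transliterator {

    // Letters are written as Unicode escapes so the source stays ASCII-only.
    private static final Map<Character, String> TABLE = new HashMap<>();
    static {
        TABLE.put('\u03B1', "a");   // Greek alpha
        TABLE.put('\u0434', "d");   // Cyrillic de
        TABLE.put('\u03C7', "ch");  // Greek chi becomes a digraph
        TABLE.put('\u0576', "n");   // Armenian now
        TABLE.put('\u00E6', "ae");  // Latin ae ligature
    }

    /** Swaps each known letter for its counterpart; unknown letters pass through unchanged. */
    static String transliterate(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            String mapped = TABLE.get(c);
            out.append(mapped != null ? mapped : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Greek chi followed by Greek alpha
        System.out.println(transliterate("\u03C7\u03B1")); // prints "cha"
    }
}
```

A table-driven pass like this never needs to understand the text it processes, which is the property that separates it from translation in the sense defined above.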
Translation retells a story through language which moves the story from one cultural context to another. Transliteration changes symbolic form, but doesn’t try to recast the story as it moves from presentation of meaning within context. It only tries to make the same story, told the same way, become more familiar for acceptance to a different audience.
The issue is, what is the context, what is the story, and how does the story carry meaning to be conveyed and understood within each context, while keeping a universal meaning with a universal context. A story about fishing, whose theme is a parent-child bonding experience, should not become an advocacy tale for aquaculture as an economic policy.
The migration from Java to PHP is both syntactic and semantic. Most translations attempted and accomplished by automated algorithms are shallow; and that is both their Achilles Heel and the source of their apparent power. This includes deep learning, AI language translations.
Translation appears at first to be the better methodology. Migrate from finished (hypothetically working) code written in software development language A, to a finished product in software development language B, which will then (hypothetically) work, and perform function in an identical manner. Theoretically understand the problem set, and provide resolution, with similar or greater effectiveness. The problem with machine translation is that it migrates an inner context without regard or understanding of an outer context, which is the real context of concern. The inner context was a means to an end. Effectiveness is not merely a function of the efficiency of migration between inner contexts; but also a function of meaningful migration and retention of component significance within the outer context. All of the way out, and up the hill. And, it’s turtles all the way up.
Computer languages are constructed languages, like Esperanto; although they have multiplied and have come to resemble pidgin languages, like Pidgin English, more and more, with a plethora of geek dialects. Development methodologies for engineering and architecting software systems, cyber systems, human-machine interaction systems, were once learned, and taught, as logical, analytical frameworks. Today, in most software development, following convention is the rule, not following logical order. The result is fractured cultures of software development which are incompatible; which require more layers of fractured culture, and culture wars, to achieve functionality between incompatible systems. What was once a reasonably simple hardware-related symbol-set conversion issue to get two systems to communicate (ASCII-to-EBCDIC, and little to big endian) is now a massive competing virtual container, operating system, computer language, data base and convention set translation problem. (The original problem has expanded with the introduction of multi-byte character sets, but MBCS is, at core, a combination of the same two issues: character code conversion and bit or byte order in the data stream.) The result is threefold:
- Core cyber systems deteriorate from lack of necessary refresh and renewal.
- New, newer, and yet newer again cyber clothing (interfaces) gets created to access core functions, used briefly, thrown away, and rebuilt as a scheme for market dominance of tool over intention.
- All, or most, of the increase in machine processing power which has been created as a result of Moore’s Law for hardware is consumed by the thrashing of bloated software; and fundamental progress toward increases in human productivity is slowly ground to a halt in a race forward which has trouble simply maintaining capability that was functionally delivered previously, with less flashy technology.
The primary problem with blind faith in Moore’s Law is a failure to recognize the beginning slope of a sigmoid curve, relative to growth pattern, and the resulting assumption that growth (increase as opposed to replacement) can take place as an exponential function. Logistic functions are bounded. They have real floors and ceilings. Exponential functions continue to grow infinitely. It’s debatable whether anything can grow to infinity, although the concept can be incorporated using surreal techniques. Cascaded logistic functions can be used to map technology substitutions, the essence of which can create a step ladder that is equally, or more, useful than Moore’s Law in understanding achievable, patterned, growth.
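To make the distinction concrete, here is a minimal numeric sketch comparing an exponential curve, a single logistic curve, and a cascade of logistic curves. The class name, the method names, and every parameter value are invented for illustration; nothing here is fitted to real hardware data. The exponential is deliberately chosen to match the early slope of the logistic curve, which is the region where the two are easy to confuse.

```java
/** Hypothetical illustration; parameter values are invented for the example. */
final class GrowthCurves {

    /** Unbounded exponential growth: x0 * e^(rate * t). */
    static double exponential(double x0, double rate, double t) {
        return x0 * Math.exp(rate * t);
    }

    /** Logistic (sigmoid) growth with a floor of 0 and the given ceiling. */
    static double logistic(double ceiling, double rate, double midpoint, double t) {
        return ceiling / (1.0 + Math.exp(-rate * (t - midpoint)));
    }

    /**
     * Cascade of logistic steps, one per technology generation.
     * Each row is {ceiling, rate, midpoint}; the contributions are summed.
     */
    static double cascade(double[][] generations, double t) {
        double total = 0.0;
        for (double[] g : generations) {
            total += logistic(g[0], g[1], g[2], t);
        }
        return total;
    }

    public static void main(String[] args) {
        double ceiling = 1000.0, rate = 1.0, midpoint = 10.0;
        // Exponential chosen so that it tracks the logistic curve while t is small.
        double x0 = ceiling * Math.exp(-rate * midpoint);
        double[][] generations = { {1000.0, 1.0, 10.0}, {4000.0, 1.0, 30.0} };
        for (double t = 0; t <= 40; t += 10) {
            System.out.printf("t=%4.0f  exp=%16.3f  logistic=%9.3f  cascade=%9.3f%n",
                    t, exponential(x0, rate, t),
                    logistic(ceiling, rate, midpoint, t),
                    cascade(generations, t));
        }
    }
}
```

Near t = 0 the exponential and logistic columns are almost identical; later the logistic column flattens at its ceiling while the exponential keeps climbing, and the cascade rises in steps as each new generation takes over.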
Relative to cyber and Moore’s Law, there are some who think that the reported death of Moore’s Law is temporary, and that the problems which have been uncovered relative to the limits of mechanical physics will be overcome via a physical form of quantum computing. And Black Holes may be doorways to alternate universes.
Logical thinking about infinite growth has two aspects, one of which is real (time), and one of which is surreal (experience of transition through time).
- Infinite growth is consequential because any framework or plan of intention needs to identify and evolve toward existence within a concept of infinite time. So infinity must be handled as part of any growth equation. This is one problem if looking at discrete growth over discrete time. It is a different problem if attempting an understanding, and a working solution, which has no defined terminal horizon. Then, time must be dealt with as a continuous function which has no upper limit, and what happens at t+1 is based on existence at t, and needs to proceed to t+2, t+3, … forever. There does not need to be a t=0. Time is a continuum. Stasis is stability of capability, equilibrium, from t-n through to t+n, where n is any number. It graphs as a line parallel to the x-axis at some level of y. Growth is measured as change in experience over time, positive or negative. Acceleration in physics stops before one reaches the speed of light. Everything that’s physical has upper, and lower, limits. If you don’t know the limits, you’re lost; and your measure has no significant meaning.
- Infinite growth is not achievable for bounded entities. That is an oxymoron. Physics operates in a discrete manner; that is how it is defined, as discrete relationships which exist between discrete elements exhibiting discrete behaviors. Abstractions can be unbounded, but abstraction involves movement beyond physics to the world of the surreal.
The point of all of this is a question I first posed to my managers in the 1970s. In 1971 I was assigned as the design engineer (Assistant Systems Analyst – Trainee), on a two person team, for a small, temporary, automation system, Customer Safekeeping (CSK), which was supposed to be built to last 6 months. The reason for the small, temporary system was that a larger system, the Bank Investment Securities Accounting System (BISAS), which had been worked on for five years at the time, was not yet fully implemented and was still undergoing development. At that point, missing requirements related to missed operational processes were discovered while planning conversion and new system work flows. (Similar stories played out over time. The most recent in my experience was a post-Y2K attempt to outsource a portion of an operational system that resulted in the elimination of critical cross-functional reporting which fed data to unidentified control functions elsewhere in the organization.) As we completed the six month development and installation of the temporary system specified for six months of operation, the large system was finally declared undoable, and its development project was terminated. In the void that resulted, I was asked whether it was possible to enhance the small system, with a scheduled discrete lifespan, to assume additional responsibilities which were critically needed, but no longer forthcoming, that had previously been the purview of the larger, now defunct, system. I had two questions:
- What does it need to do?
- How long does it have to last?
I was given a clear answer to #1. Not so much for #2. Implementing the immediate new functional requirements, with an expanded team, in a matter of months, enabled the Bank, a Primary Dealer, to participate in the FRB’s new Book-Entry Securities System. The system (renamed Securities Inventory Control, SIC, with functional responsibility for securities inventory movement and control for both customer and Bank owned securities) was expanded and evolved multiple times after that to a full, data center operational, overnight, batched, SMAC system interfaced to a critical time, on-site, capture and processing system. It stayed in operation for over 15 years, still running portions of the original code, and following a path established by the original architecture.
In 1976, while working on implementation of securities transaction control and financial accounting enhancements, we planned to evolve our distributed architecture (a three point architecture involving twin mini-computers on the front end, the datacenter hub, and a timeshare management information reporting component) to a consolidated, online system that would again attempt a fully integrated tool for the division’s business, as had been the intention of the previous “big system.” At the same time new regulatory compliance requirements were emerging (1975 changes to the MSRB).
We asked the same questions. We got the same non-answer to question #2. Our proposal for a replacement system (STACS, the Securities Trading and Clearance System) wasn’t accepted. Instead, a quick conversion (front-end technology translation) of our on-site mini-computer system was advanced as the solution to the regulatory concerns (READQ, the name of the firm which developed and marketed the system). That cost another 5 years to achieve acceptance that the functional architecture was undoable. Our old system, the “to be replaced in six months” solution that had evolved instead, was then updated to address the overdue regulatory reporting requirements, updated again for another set of regulatory changes (TEFRA), and finally retired in the late 1980s.
The life of code is ultimately a discrete function and can be managed as such, as long as the life of the system of which it is a part is also a discrete function. If the system is critical to the business it supports, as a cyber system often is, then the system, including the code, needs to be contemporaneous with the lifespan of the business. Life is discrete, but it isn’t made up of discrete periods. Cyber-physical systems that are integral to life support for the physical component need to reflect that in their architecture, and their management.
The way to answer question #2 is to ask a follow up question:
- When do you plan on going out of business?
A cornerstone of good architecture is durability of utility through intended life. EATS, and the associated Architected Futures™ methodology, including the chosen methods for software migration, is aimed at systems for which the answer to “when will it no longer be needed?” is: “Never.” Never doesn’t happen by accident. It’s intentional, or it doesn’t happen.
The difference between coding languages has more to do with the library systems provided than it does with the syntax and semantics of the language. COBOL and Fortran were supported by libraries that were invisible to the majority of people who wrote code in those languages. C was a very sparse language that was very close to machine language. What made C useful was its standard library of basic functions. Machine language, Assembler, was functional for productive development primarily because it included a Macro language which consisted of a library of common functions. The same is true for Java and PHP and most other modern languages. The same is also true of systems developed with computer languages where the system consists of more than a few simple program modules. What made SIC successful was the CopyLib. The same is true of EATS, in both Java and PHP form.
The approach to code translation for EATSv5 is twofold:
- Translation of Java syntax instructions to PHP syntax instructions.
- Translation of the Java execution library system to a PHP execution library.
Part 1 relates to the discussion above concerning Java2php and JavaToPhp. Part 2 relates to the comments about the approach taken by RuntimeConverter. But it goes beyond that. RuntimeConverter is a general purpose conversion utility. As a rule, these are effective, but inefficient in terms of operation and ongoing support. They become emulators rather than true translations or conversions. Systems developed with an understanding and design of, and as, systems-of-systems, such as SIC and EATS, incorporate common systems of library components that are shared between other system components.
For the translation task there are three such libraries:
- PHP libraries, of which the Java code implementation is unaware, but which are available to support functionality in the translated code.
- Java libraries, which were used by the Java code implementation, and which are not available to the PHP code (unless supplied by the conversion utility, such as RuntimeConverter, above and beyond the instructions in the translated code).
- EATS library functions, which are included in the converted code.
The method we are using for our translation is:
- Drive the translation process of each module based on a Java Abstract Syntax Tree, AST, modeled on an expanded and updated version of JavaToPhp.
- Incorporate into the translation an awareness of a specialized, pre-written set of library utility functions which account for selected aspects of the contextual framing translation that do not surface in the syntactical code analysis, similar to Java2php and RuntimeConverter. One example of such a class is com/architectedfutures/lang/BasicEnum, which serves as a helper for Enum translation to PHP.
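The two steps above can be sketched in miniature. The example below is not the JavaParser API and not the JavaToPhp or EATS code; it is a hypothetical, hand-built AST with just enough node types to show the idea. The `PhpEmitter` class, the node classes, and the assumption that a pre-written PHP class named `ArrayList` with an `add` method exists are all inventions for illustration. Types with no library entry are emitted by simple name and recorded for hand conversion, in keeping with the 80% goal described earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Translation context: the library map plus a worklist for hand conversion. */
final class PhpEmitter {
    private final Map<String, String> library;         // Java type -> PHP class (hypothetical entries)
    final List<String> unresolved = new ArrayList<>();  // types left for manual follow-up

    PhpEmitter(Map<String, String> library) { this.library = library; }

    String mapType(String javaType) {
        String phpType = library.get(javaType);
        if (phpType == null) {
            unresolved.add(javaType);                    // no pre-written PHP class known
            phpType = javaType.substring(javaType.lastIndexOf('.') + 1);
        }
        return phpType;
    }
}

/** Hypothetical miniature AST node; not the JavaParser or JavaToPhp API. */
interface Node {
    String toPhp(PhpEmitter emitter);
}

final class Var implements Node {
    private final String name;
    Var(String name) { this.name = name; }
    public String toPhp(PhpEmitter emitter) { return "$" + name; }
}

final class StringLit implements Node {
    private final String value;
    StringLit(String value) { this.value = value; }
    public String toPhp(PhpEmitter emitter) {
        // PHP single-quoted strings need only backslash and quote escaped.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }
}

final class NewObject implements Node {
    private final String javaType;
    NewObject(String javaType) { this.javaType = javaType; }
    public String toPhp(PhpEmitter emitter) { return "new " + emitter.mapType(javaType) + "()"; }
}

final class Assign implements Node {
    private final Var target;
    private final Node value;
    Assign(Var target, Node value) { this.target = target; this.value = value; }
    public String toPhp(PhpEmitter emitter) {
        return target.toPhp(emitter) + " = " + value.toPhp(emitter) + ";";
    }
}

final class MethodCall implements Node {
    private final Node target;
    private final String method;
    private final List<Node> args;
    MethodCall(Node target, String method, List<Node> args) {
        this.target = target; this.method = method; this.args = args;
    }
    public String toPhp(PhpEmitter emitter) {
        List<String> parts = new ArrayList<>();
        for (Node arg : args) parts.add(arg.toPhp(emitter));
        return target.toPhp(emitter) + "->" + method + "(" + String.join(", ", parts) + ")";
    }
}

final class AstDemo {
    public static void main(String[] args) {
        Map<String, String> library = new HashMap<>();
        library.put("java.util.ArrayList", "ArrayList"); // assumed pre-written PHP class
        PhpEmitter emitter = new PhpEmitter(library);

        // Models the Java statements: names = new ArrayList(); names.add("EATS");
        Node assign = new Assign(new Var("names"), new NewObject("java.util.ArrayList"));
        Node call = new MethodCall(new Var("names"), "add",
                Arrays.<Node>asList(new StringLit("EATS")));

        System.out.println(assign.toPhp(emitter));      // prints: $names = new ArrayList();
        System.out.println(call.toPhp(emitter) + ";");  // prints: $names->add('EATS');
    }
}
```

Keeping the library map separate from the node classes mirrors the split described above: the syntax-driven walk stays generic, while knowledge of the pre-written PHP library lives in one replaceable table.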
The conversion process itself will translate the EATSv4 Java library code to form the EATSv5 PHP library. In addition, since Java is an OO language, and Java’s libraries are implemented primarily as Objects, selected portions of the Java library will be pre-implemented as PHP objects to be instantiated and used by the translated code in straightforward form. For a native PHP programmer, this will sometimes create a strange form of code, for example seeing an ArrayList or a Map being called out to perform a task where a simple array() would suffice. However, it provides a clean and simple translation and will make the code look more familiar to someone who is fundamentally a Java programmer. The primary consequence of this aspect is a slight loss of performance. However, given that PHP is an interpreted language, and has already lost a performance comparison to most compiled languages, the difference should be inconsequential. On the flip side, this creates a much easier job for the translator, and the pre-coded translation can implement Maps, ArrayLists and Java arrays using inheritance from common library code built on PHP arrays. Similar handling can be applied to types which are a concern in Java’s type checked system and which want a looser implementation in PHP.