Talend Component Design Choices

Component API

The Component API is

The component API has multiple strong choices:

  1. it is declarative (through annotations) to ensure it is

    1. evolutive (it can get new fancy features without breaking old code)

    2. static as much as possible

Evolution

Being fully declarative, any new API can be added iteratively without requiring any changes to existing components.

Example (projection on beam potential evolution):

@ElementListener
public MyOutput onElement(MyInput data) {
    return ...;
}

wouldn’t be affected by the addition of the new Timer API which can be used like:

@ElementListener
public MyOutput onElement(MyInput data,
                          @Timer("my-timer") Timer timer) {
    return ...;
}

Static

UI friendly

Intent of the framework is to be able to fit java UI as well as web UI. It must be understood as colocalized and remote UI. The direct impact of that choice is to try to move as much as possible the logic to the UI side for UI related actions. Typically we want to validate a pattern, a size, …​ on the client side and not on the server side. Being static encourages this practise.

Auditable and with clear expectations

The other goal to be really static in its definition is to ensure the model will not be mutated at runtime and all the auditing and modelling can be done before, in the design phase.

Dev friendly

Being static also ensures the development can be validated as much as possible through build tools. This doesn’t replace the requirement to test the components but helps the developer to maintain its components with automated tools.

Flexible data modeling

Generic and specific

The processor API supports JsonObject as well as any custom model. Intent is to support generic component development which need to access configured "object paths" and specific components which rely on a well defined path from the input.

A generic component would look like:

@ElementListener
public MyOutput onElement(JsonObject input) {
    return ...;
}

A specific component would look like (with MyInput a POJO):

@ElementListener
public MyOutput onElement(MyInput input) {
    return ...;
}

No runtime assumption

By design the framework must run in DI (plain standalone java program) but also in Beam pipelines. It is also out of scope of the framework to handle the way the runtime serializes - if needed - the data. For that reason it is primordial to not import serialization constraint in the stack. This is why JsonObject is not an IndexedRecord from avro for instance, to not impose any implementation. Any actual serialization concern - implementation - should either be hidden in the framework runtime (= outside component developer scope) or in the runtime integration with the framework (beam integration for instance). In this context, JSON-P is a good compromise because it brings a very powerful API with very few constraints.

Isolated

The components must be able to execute even if they have conflicting libraries. For that purpose it requires to isolate their classloaders. For that purpose a component will define its dependencies based on a maven format and will always be bound to its own classloader.

REST

Consumable model

The definition payload is as flat as possible and strongly typed to ensure it can be manipulated by consumers. This way the consumers can add/remove fields with just some mapping rules and don’t require any abstract tree handling.

The execution (runtime) configuration is the concatenation of a few framework metadata (only the version actually) and a key/value model of the instance of the configuration based on the definition properties paths for the keys. This enables the consumers to maintain and work with the keys/values up to their need.

The framework not being responsible for any persistence it is crucial to ensure consumers can handle it from end to end which includes the ability to search for values (update a machine, update a port etc…​) and keys (new encryption rule on key certificate for instance).

Talend component is a metamodel provider (to build forms) and runtime execution platform (take a configuration instance and use it volatively to execute a component logic). This implies it can’t own the data more than defining the contract it has for these two endpoints and must let the consumers handle the data lifecycle (creation, encryption, deletion, …​.).

Execution with streaming

A new mime type called talend/stream is introduced to define a streaming format.

It basically matches a JSON object per line:

{"key1":"value1"}
{"key2":"value2"}
{"key1":"value11"}
{"key1":"value111"}
{"key2":"value2"}

Fixed set of icons

Icons (@Icon) are based on a fixed set. Even if a custom icon is usable this is without any guarantee. This comes from the fact components can be used in any environment and require a kind of uniform look which can’t be guaranteed outside the UI itself so defining only keys is the best way to communicate this information.

when you exactly know how you will deploy your component (ie in the Studio) then you can use `@Icon(value = CUSTOM, custom = "…​") to use a custom icon file.
Scroll to top