Talend Component Kit Developer Reference Guide :: Talend Component Documentation

Talend Components Definitions Documentation

Component definition

Talend Component Kit framework relies on several primitive components.

All components can use @PostConstruct and @PreDestroy annotations to initialize or release some underlying resource at the beginning and the end of a processing.

In distributed environments, class constructor are called on cluster manager node. Methods annotated with @PostConstruct and @PreDestroy are called on worker nodes. Thus, partition plan computation and pipeline tasks are performed on different nodes.

deployment diagram

The created task is a JAR file containing class information, which describes the pipeline (flow) that should be processed in cluster.
During the partition plan computation step, the pipeline is analyzed and split into stages. The cluster manager node instantiates mappers/processors, gets estimated data size using mappers, and splits created mappers according to the estimated data size.
All instances are then serialized and sent to the worker node.
Serialized instances are received and deserialized. Methods annotated with @PostConstruct are called. After that, pipeline execution starts. The @BeforeGroup annotated method of the processor is called before processing the first element in chunk.
After processing the number of records estimated as chunk size, the @AfterGroup annotated method of the processor is called. Chunk size is calculated depending on the environment the pipeline is processed by. Once the pipeline is processed, methods annotated with @PreDestroy are called.

All the methods managed by the framework must be public. Private methods are ignored.

driver processing workflow

worker processing workflow

The framework is designed to be as declarative as possible but also to stay extensible by not using fixed interfaces or method signatures. This allows to incrementally add new features of the underlying implementations.

PartitionMapper

A PartitionMapper is a component able to split itself to make the execution more efficient.

This concept is borrowed from big data and useful in this context only (BEAM executions). The idea is to divide the work before executing it in order to reduce the overall execution time.

The process is the following:

The size of the data you work on is estimated. This part can be heuristic and not very precise.
From that size, the execution engine (runner for Beam) requests the mapper to split itself in N mappers with a subset of the overall work.
The leaf (final) mapper is used as a Producer (actual reader) factory.

This kind of component must be Serializable to be distributable.

Definition

A partition mapper requires three methods marked with specific annotations:

@Assessor for the evaluating method
@Split for the dividing method
@Emitter for the Producer factory

@Assessor

The Assessor method returns the estimated size of the data related to the component (depending its configuration). It must return a Number and must not take any parameter.

For example:

@Assessor
public long estimateDataSetByteSize() {
    return ....;
}

@Split

The Split method returns a collection of partition mappers and can take optionally a @PartitionSize long value as parameter, which is the requested size of the dataset per sub partition mapper.

For example:

@Split
public List<MyMapper> split(@PartitionSize final long desiredSize) {
    return ....;
}

@Emitter

The Emitter method must not have any parameter and must return a producer. It uses the partition mapper configuration to instantiate and configure the producer.

For example:

@Emitter
public MyProducer create() {
    return ....;
}

Producer

A Producer is a component interacting with a physical source. It produces input data for the processing flow.

A producer is a simple component that must have a @Producer method without any parameter. It can return any data:

@Producer
public MyData produces() {
    return ...;
}

Processor

A Processor is a component that converts incoming data to a different model.

A processor must have a method decorated with @ElementListener taking an incoming data and returning the processed data:

@ElementListener
public MyNewData map(final MyData data) {
    return ...;
}

Processors must be Serializable because they are distributed components.

If you just need to access data on a map-based ruleset, you can use JsonObject as parameter type.
From there, Talend Component Kit wraps the data to allow you to access it as a map. The parameter type is not enforced.
This means that if you know you will get a SuperCustomDto, then you can use it as parameter type. But for generic components that are reusable in any chain, it is highly encouraged to use JsonObject until you have an evaluation language-based processor that has its own way to access components.

For example:

@ElementListener
public MyNewData map(final JsonObject incomingData) {
    String name = incomingData.getString("name");
    int name = incomingData.getInt("age");
    return ...;
}

// equivalent to (using POJO subclassing)

public class Person {
    private String age;
    private int age;

    // getters/setters
}

@ElementListener
public MyNewData map(final Person person) {
    String name = person.getName();
    int age = person.getAge();
    return ...;
}

A processor also supports @BeforeGroup and @AfterGroup methods, which must not have any parameter and return void values. Any other result would be ignored. These methods are used by the runtime to mark a chunk of the data in a way which is estimated good for the execution flow size.

Because the size is estimated, the size of a group can vary. It is even possible to have groups of size 1.

It is recommended to batch records, for performance reasons:

@BeforeGroup
public void initBatch() {
    // ...
}

@AfterGroup
public void endBatch() {
    // ...
}

It is a good practice to support a maxBatchSize here and to commit before the end of the group, in case of a computed size that is way too big for your backend to handle.

Multiple outputs

In some cases, you may need to split the output of a processor in two. A common example is to have "main" and "reject" branches where part of the incoming data are passed to a specific bucket to be processed later.

To do that, you can use @Output as replacement of the returned value:

@ElementListener
public void map(final MyData data, @Output final OutputEmitter<MyNewData> output) {
    output.emit(createNewData(data));
}

Alternatively, you can pass a string that represents the new branch:

@ElementListener
public void map(final MyData data,
                @Output final OutputEmitter<MyNewData> main,
                @Output("rejected") final OutputEmitter<MyNewDataWithError> rejected) {
    if (isRejected(data)) {
        rejected.emit(createNewData(data));
    } else {
        main.emit(createNewData(data));
    }
}

// or

@ElementListener
public MyNewData map(final MyData data,
                    @Output("rejected") final OutputEmitter<MyNewDataWithError> rejected) {
    if (isSuspicious(data)) {
        rejected.emit(createNewData(data));
        return createNewData(data); // in this case the processing continues but notifies another channel
    }
    return createNewData(data);
}

Multiple inputs

Having multiple inputs is similar to having multiple outputs, except that an OutputEmitter wrapper is not needed:

@ElementListener
public MyNewData map(@Input final MyData data, @Input("input2") final MyData2 data2) {
    return createNewData(data1, data2);
}

@Input takes the input name as parameter. If no name is set, it defaults to the "main (default)" input branch. It is recommended to use the default branch when possible and to avoid naming branches according to the component semantic.

Output

An Output is a Processor that does not return any data.

Conceptually, an output is a data listener. It matches the concept of processor. Being the last component of the execution chain or returning no data makes your processor an output component:

@ElementListener
public void store(final MyData data) {
    // ...
}

Combiners

Currently, Talend Component Kit does not allow you to define a Combiner. A combiner is the symmetric part of a partition mapper and allows to aggregate results in a single partition.

Family and component icons

Every component family and component needs to have a representative icon.
You can use one of the icons provided by the framework or you can use a custom icon.

For the component family the icon is defined in the package-info.java file. For the component itself, you need to declare it in the component class.

To use a custom icon, you need to have the icon file placed in the resources/icons folder of the project. The icon file needs to have a name following the convention IconName_icon32.png, where you can replace IconName by the name of your choice.

@Icon(value = Icon.IconType.CUSTOM, custom = "IconName")

Configuring components

Components are configured using their constructor parameters. They can all be marked with the @Option property, which lets you give a name to parameters.

For the name to be correct, you must follow these guidelines:

Use a valid Java name.
Do not include any . character in it.
Do not start the name with a $.
Defining a name is optional. If you don’t set a specific name, it defaults to the bytecode name, which can require you to compile with a -parameter flag to not end up with names such as arg0, arg1, and so on.

Parameter types can be primitives or complex objects with fields decorated with @Option exactly like method parameters.

It is recommended to use simple models which can be serialized in order to ease serialized component implementations.

For example:

class FileFormat implements Serializable {
    @Option("type")
    private FileType type = FileType.CSV;

    @Option("max-records")
    private int maxRecords = 1024;
}

@PartitionMapper(family = "demo", name = "file-reader")
public MyFileReader(@Option("file-path") final File file,
                    @Option("file-format") final FileFormat format) {
    // ...
}

Using this kind of API makes the configuration extensible and component-oriented, which allows you to define all you need.

The instantiation of the parameters is done from the properties passed to the component.

Examples of option names:

Option name	Valid
myName
my_name
my.name
$myName

Option name

Valid

myName

my_name

my.name

$myName

Primitives

A primitive is a class which can be directly converted from a String to the expected type.

It includes all Java primitives, like the String type itself, but also all types with a org.apache.xbean.propertyeditor.Converter:

BigDecimal
BigInteger
File
InetAddress
ObjectName
URI
URL
Pattern

Complex object mapping

The conversion from property to object uses the Dot notation.

For example, assuming the method parameter was configured with @Option("file"):

file.path = /home/user/input.csv
file.format = CSV

matches

public class FileOptions {
    @Option("path")
    private File path;

    @Option("format")
    private Format format;
}

List case

Lists rely on an indexed syntax to define their elements.

For example, assuming that the list parameter is named files and that the elements are of the FileOptions type, you can define a list of two elements as follows:

files[0].path = /home/user/input1.csv
files[0].format = CSV
files[1].path = /home/user/input2.xml
files[1].format = EXCEL

Map case

Similarly to the list case, the map uses .key[index] and .value[index] to represent its keys and values:

// Map<String, FileOptions>
files.key[0] = first-file
files.value[0].path = /home/user/input1.csv
files.value[0].type = CSV
files.key[1] = second-file
files.value[1].path = /home/user/input2.xml
files.value[1].type = EXCEL

// Map<FileOptions, String>
files.key[0].path = /home/user/input1.csv
files.key[0].type = CSV
files.value[0] = first-file
files.key[1].path = /home/user/input2.xml
files.key[1].type = EXCEL
files.value[1] = second-file

Avoid using the Map type. For example, if you can configure your component with an object instead.

Defining Constraints and validations on the configuration

You can use metadata to specify that a field is required or has a minimum size, and so on. This is done using the validation metadata in the org.talend.sdk.component.api.configuration.constraint package:

API	Name	Parameter Type	Description	Supported Types	Metadata sample
@org.talend.sdk.component.api.configuration.constraint.Max	maxLength	double	Ensure the decorated option size is validated with a higher bound.	CharSequence	{"validation::maxLength":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Min	minLength	double	Ensure the decorated option size is validated with a lower bound.	CharSequence	{"validation::minLength":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Pattern	pattern	string	Validate the decorated string with a javascript pattern (even into the Studio).	CharSequence	{"validation::pattern":"test"}
@org.talend.sdk.component.api.configuration.constraint.Max	max	double	Ensure the decorated option size is validated with a higher bound.	Number, int, short, byte, long, double, float	{"validation::max":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Min	min	double	Ensure the decorated option size is validated with a lower bound.	Number, int, short, byte, long, double, float	{"validation::min":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Required	required	-	Mark the field as being mandatory.	Object	{"validation::required":"true"}
@org.talend.sdk.component.api.configuration.constraint.Max	maxItems	double	Ensure the decorated option size is validated with a higher bound.	Collection	{"validation::maxItems":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Min	minItems	double	Ensure the decorated option size is validated with a lower bound.	Collection	{"validation::minItems":"12.34"}
@org.talend.sdk.component.api.configuration.constraint.Uniques	uniqueItems	-	Ensure the elements of the collection must be distinct (kind of set).	Collection	{"validation::uniqueItems":"true"}

When using the programmatic API, metadata is prefixed by tcomp::. This prefix is stripped in the web for convenience, and the table above uses the web keys.

Also note that these validations are executed before the runtime is started (when loading the component instance) and that the execution will fail if they don’t pass. If somehow it breaks your application you can disable that validation on the JVM by setting the system property talend.component.configuration.validation.skip to true.

Marking a configuration as a particular type of data

It is common to classify the incoming data. It is similar to tagging data with several types. Data can commonly be categorized as follows:

Datastore: The data you need to connect to the backend.
Dataset: A datastore coupled with the data you need to execute an action.

API	Type	Description	Metadata sample
org.talend.sdk.component.api.configuration.type.DataSet	dataset	Mark a model (complex object) as being a dataset.	{"tcomp::configurationtype::type":"dataset","tcomp::configurationtype::name":"test"}
org.talend.sdk.component.api.configuration.type.DataStore	datastore	Mark a model (complex object) as being a datastore (connection to a backend).	{"tcomp::configurationtype::type":"datastore","tcomp::configurationtype::name":"test"}

The component family associated with a configuration type (datastore/dataset) is always the one related to the component using that configuration.

Those configuration types can be composed to provide one configuration item. For example, a dataset type often needs a datastore type to be provided. A datastore type (that provides the connection information) is used to create a dataset type.

Those configuration types are also used at design time to create shared configurations that can be stored and used at runtime.

For example, in the case of a relational database that supports JDBC:

A datastore can be made of:
- a JDBC URL
- a username
- a password.
A dataset can be made of:
- a datastore (that provides the data required to connect to the database)
- a table name
- data.

The component server scans all configuration types and returns a configuration type index. This index can be used for the integration into the targeted platforms (Studio, web applications, and so on).

The configuration type index is represented as a flat tree that contains all the configuration types, which themselves are represented as nodes and indexed by ID.

Every node can point to other nodes. This relation is represented as an array of edges that provides the child IDs.

As an illustration, a configuration type index for the example above can be defined as follows:

{nodes: {
             "idForDstore": { datastore:"datastore data", edges:[id:"idForDset"] },
             "idForDset":   { dataset:"dataset data" }
    }
}

Defining links between properties

If you need to define a binding between properties, you can use a set of annotations:

API	Name	Description	Metadata Sample
@org.talend.sdk.component.api.configuration.condition.ActiveIf	if	If the evaluation of the element at the location matches value then the element is considered active, otherwise it is deactivated.	{"condition::if::target":"test","condition::if::value":"value1,value2"}
@org.talend.sdk.component.api.configuration.condition.ActiveIfs	ifs	Allows to set multiple visibility conditions on the same property.	{"condition::if::value::0":"value1,value2","condition::if::value::1":"SELECTED","condition::if::target::0":"sibling1","condition::if::target::1":"../../other"}

The target element location is specified as a relative path to the current location, using Unix path characters. The configuration class delimiter is /.
The parent configuration class is specified by ...
Thus, ../targetProperty denotes a property, which is located in the parent configuration class and is named targetProperty.

When using the programmatic API, metadata is prefixed with tcomp::. This prefix is stripped in the web for convenience, and the previous table uses the web keys.

Adding hints about the rendering based on configuration/component knowledge

In some cases, you may need to add metadata about the configuration to let the UI render that configuration properly.
For example, a password value that must be hidden and not a simple clear input box. For these cases - if you want to change the UI rendering - you can use a particular set of annotations:

API	Description	Generated property metadata
@org.talend.sdk.component.api.configuration.ui.DefaultValue	Provide a default value the UI can use - only for primitive fields.	{"ui::defaultvalue::value":"test"}
@org.talend.sdk.component.api.configuration.ui.OptionsOrder	Allows to sort a class properties.	{"ui::optionsorder::value":"value1,value2"}
@org.talend.sdk.component.api.configuration.ui.layout.AutoLayout	Request the rendered to do what it thinks is best.	{"ui::autolayout":"true"}
@org.talend.sdk.component.api.configuration.ui.layout.GridLayout	Advanced layout to place properties by row, this is exclusive with `@OptionsOrder`.	{"ui::gridlayout::value1::value":"first\|second,third","ui::gridlayout::value2::value":"first\|second,third"}
@org.talend.sdk.component.api.configuration.ui.layout.GridLayouts	Allow to configure multiple grid layouts on the same class, qualified with a classifier (name)	{"ui::gridlayout::Advanced::value":"another","ui::gridlayout::Main::value":"first\|second,third"}
@org.talend.sdk.component.api.configuration.ui.layout.HorizontalLayout	Put on a configuration class it notifies the UI an horizontal layout is preferred.	{"ui::horizontallayout":"true"}
@org.talend.sdk.component.api.configuration.ui.layout.VerticalLayout	Put on a configuration class it notifies the UI a vertical layout is preferred.	{"ui::verticallayout":"true"}
@org.talend.sdk.component.api.configuration.ui.widget.Code	Mark a field as being represented by some code widget (vs textarea for instance).	{"ui::code::value":"test"}
@org.talend.sdk.component.api.configuration.ui.widget.Credential	Mark a field as being a credential. It is typically used to hide the value in the UI.	{"ui::credential":"true"}
@org.talend.sdk.component.api.configuration.ui.widget.Structure	Mark a List<String> or Map<String, String> field as being represented as the component data selector (field names generally or field names as key and type as value).	{"ui::structure::type":"null","ui::structure::discoverSchema":"test","ui::structure::value":"test"}
@org.talend.sdk.component.api.configuration.ui.widget.TextArea	Mark a field as being represented by a textarea(multiline text input).	{"ui::textarea":"true"}

When using the programmatic API, metadata is prefixed with tcomp::. This prefix is stripped in the web for convenience, and the previous table uses the web keys.

Target support should cover org.talend.core.model.process.EParameterFieldType but you need to ensure that the web renderer is able to handle the same widgets.

Widget and validation gallery

Widgets

Name Code Studio Rendering Web Rendering

Name	Code	Studio Rendering	Web Rendering
Input/Text	`@Option String config;`
Password	`@Option @Credential String config;`
Textarea	`@Option @Textarea String config;`
Checkbox	`@Option Boolean config;`
List	`@Option @Proposable("valuesProvider") String config; /** service class */ @DynamicValues("valuesProvider") public Values vendors(){ return new Values(asList(new Values.Item("1", "Delete"), new Values.Item("2", "Insert") new Values.Item("3", "Update"))); }`
List	`@Option ActionEnum config; /** Define enum */ enum ActionEnum {Delete, Insert, Update}`
Table	`@Option Object config;`
Code	`@Code("java") @Option String config;`
Schema	`@Option @Structure List<String> config;`

Input/Text

@Option
String config;

Password

@Option
@Credential
String config;

Textarea

@Option
@Textarea
String config;

Checkbox

@Option
Boolean config;

List

@Option
@Proposable("valuesProvider")
String config;
/** service class */
@DynamicValues("valuesProvider")
public Values vendors(){
    return new Values(asList(new Values.Item("1", "Delete"),
                      new Values.Item("2", "Insert")
                      new Values.Item("3", "Update")));
 }

List

@Option
ActionEnum config;
/** Define enum */
enum ActionEnum {Delete,
    Insert,
    Update}

Table

@Option
Object config;

Code

@Code("java")
@Option
String config;

Schema

@Option
@Structure
List<String> config;

Validations

Name Code Studio Rendering Web Rendering

Name	Code	Studio Rendering	Web Rendering
Property validation	`/** configuration class / @Option @Validable("url") String config; /* service class */ @AsyncValidation("url") ValidationResult doValidate(String url) { }`
Property validation with Pattern	`/** configuration class */ @Option @Pattern("/^[a-zA-Z\\-]+$/") String username;`
Data store validation	`@Datastore @Checkable public class config { /** config .../ } /* service class */ @HealthCheck public HealthCheckStatus testConnection(){ }`

Property validation

/** configuration class */
@Option
@Validable("url")
String config;

/** service class */
@AsyncValidation("url")
ValidationResult doValidate(String url) {
}

Property validation with Pattern

/** configuration class */
@Option
@Pattern("/^[a-zA-Z\\-]+$/")
String username;

Data store validation

@Datastore
@Checkable
public class config {
/** config ...*/
}

/** service class */
@HealthCheck
public HealthCheckStatus testConnection(){

}

You can also use other types of validation that are similar to @Pattern:

@Min, @Max for numbers.
@Unique for collection values.
@Required for a required configuration.

Registering components

As you may have read in the Getting Started, you need an annotation to register your component through the family method. Multiple components can use the same family value but the family + name pair must be unique for the system.

In order to share the same component family name and to avoid repetitions in all family methods, you can use the @Components annotation on the root package of your component. It allows you to define the component family and the categories the component belongs to (Misc by default if not set).

Here is a sample package-info.java:

@Components(name = "my_component_family", categories = "My Category")
package org.talend.sdk.component.sample;

import org.talend.sdk.component.api.component.Components;

Another example with an existing component:

@Components(name = "Salesforce", categories = {"Business", "Cloud"})
package org.talend.sdk.component.sample;

import org.talend.sdk.component.api.component.Components;

Components metadata

Components can require metadata to be integrated in Talend Studio or Cloud platforms. Metadata is set on the component class and belongs to the org.talend.sdk.component.api.component package.

API Description

API	Description
@Icon	Sets an icon key used to represent the component. You can use a custom key with the `custom()` method but the icon may not be rendered properly.
@Version	Sets the component version. 1 by default.

@Icon

Sets an icon key used to represent the component. You can use a custom key with the custom() method but the icon may not be rendered properly.

@Version

Sets the component version. 1 by default.

Example:

@Icon(FILE_XML_O)
@PartitionMapper(name = "jaxbInput")
public class JaxbPartitionMapper implements Serializable {
    // ...
}

Managing version configuration

If some changes impact the configuration, they can be managed through a migration handler at the component level (enabling trans-model migration support).

The @Version annotation supports a migrationHandler method which migrates the incoming configuration to the current model.

For example, if the filepath configuration entry from v1 changed to location in v2, you can remap the value in your MigrationHandler implementation.

A best practice is to split migrations into services that you can inject in the migration handler (through constructor) rather than managing all migrations directly in the handler. For example:

// full component code structure skipped for brievity, kept only migration part
@Version(value = 3, migrationHandler = MyComponent.Migrations.class)
public class MyComponent {
    // the component code...


    private interface VersionConfigurationHandler {
        Map<String, String> migrate(Map<String, String> incomingData);
    }

    public static class Migrations {
        private final List<VersionConfigurationHandler> handlers;

        // VersionConfigurationHandler implementations are decorated with @Service
        public Migrations(final List<VersionConfigurationHandler> migrations) {
            this.handlers = migrations;
            this.handlers.sort(/*some custom logic*/);
        }

        @Override
        public Map<String, String> migrate(int incomingVersion, Map<String, String> incomingData) {
            Map<String, String> out = incomingData;
            for (MigrationHandler handler : handlers) {
                out = handler.migrate(out);
            }
        }
    }
}

What is important to notice in this snippet is not the way the code is organized, but rather the fact that you can organize your migrations the way that best fits your component.

If you need to apply migrations in a specific order, make sure that they are sorted.

Consider this API as a migration callback rather than a migration API.
Adjust the migration code structure you need behind the MigrationHandler, based on your component requirements, using service injection.

@PartitionMapper

@PartitionMapper marks a partition mapper:

@PartitionMapper(family = "demo", name = "my_mapper")
public class MyMapper {
}

@Emitter

@Emitter is a shortcut for @PartitionMapper when you don’t support distribution. It enforces an implicit partition mapper execution with an assessor size of 1 and a split returning itself.

@Emitter(family = "demo", name = "my_input")
public class MyInput {
}

@Processor

A method decorated with @Processor is considered as a producer factory:

@Processor(family = "demo", name = "my_processor")
public class MyProcessor {
}

Component internationalization

In common cases, you can store messages using a properties file in your component module to use internationalization.

Store the properties file in the same package as the related components and name it Messages. For example, org.talend.demo.MyComponent uses org.talend.demo.Messages[locale].properties.

Default components keys

Out of the box components are internationalized using the same location logic for the resource bundle. The supported keys are:

Name Pattern Description

Name Pattern	Description
${family}._displayName	Display name of the family
${family}.${configurationType}.${name}._displayName	Display name of a configuration type (dataStore or dataSet)
${family}.${component_name}._displayName	Display name of the component (used by the GUIs)
${property_path}._displayName	Display name of the option.
${simple_class_name}.${property_name}._displayName	Display name of the option using its class name.
${enum_simple_class_name}.${enum_name}._displayName	Display name of the `enum_name` value of the `enum_simple_class_name` enumeration.
${property_path}._placeholder	Placeholder of the option.

${family}._displayName

Display name of the family

${family}.${configurationType}.${name}._displayName

Display name of a configuration type (dataStore or dataSet)

${family}.${component_name}._displayName

Display name of the component (used by the GUIs)

${property_path}._displayName

Display name of the option.

${simple_class_name}.${property_name}._displayName

Display name of the option using its class name.

${enum_simple_class_name}.${enum_name}._displayName

Display name of the enum_name value of the enum_simple_class_name enumeration.

${property_path}._placeholder

Placeholder of the option.

Example of configuration for a component named list and belonging to the memory family (@Emitter(family = "memory", name = "list")):

memory.list._displayName = Memory List

Configuration classes can be translated using the simple class name in the messages properties file. This is useful in case of common configurations shared by multiple components.

For example, if you have a configuration class as follows :

public class MyConfig {

    @Option
    private String host;

    @Option
    private int port;
}

You can give it a translatable display name by adding ${simple_class_name}.${property_name}._displayName to Messages.properties under the same package as the configuration class.

MyConfig.host._displayName = Server Host Name
MyConfig.host._placeholder = Enter Server Host Name...

MyConfig.port._displayName = Server Port
MyConfig.port._placeholder = Enter Server Port...

If you have a display name using the property path, it overrides the display name defined using the simple class name. This rule also applies to placeholders.

Components Packaging

Component Loading

Talend Component scanning is based on plugins. To make sure that plugins can be developed in parallel and avoid conflicts, plugins need to be isolated (component or group of components in a single jar/plugin).

Multiple options are available:

Graph classloading: this option allows you to link the plugins and dependencies together dynamically in any direction.
For example, the graph classloading can be illustrated by OSGi containers.
Tree classloading: a shared classloader inherited by plugin classloaders. However, plugin classloader classes are not seen by the shared classloader, nor by other plugins.
For example, he tree classloading is commonly used by Servlet containers where plugins are web applications.
Flat classpath: listed for completeness but rejected by design because it doesn’t comply with this requirement.

In order to avoid much complexity added by this layer, Talend Component Kit relies on a tree classloading. The advantage is that you don’t need to define the relationship with other plugins/dependencies, because it is built-in.

Here is a representation of this solution:

classloader layout

The shared area contains Talend Component Kit API, which only contains by default the classes shared by the plugins.

Then, each plugin is loaded with its own classloader and dependencies.

Packaging a plugin

This section explains the overall way to handle dependencies but the Talend Maven plugin provides a shortcut for that.

A plugin is a JAR file that was enriched with the list of its dependencies. By default, Talend Component Kit runtime is able to read the output of maven-dependency-plugin in TALEND-INF/dependencies.txt. You just need to make sure that your component defines the following plugin:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <version>3.0.2</version>
  <executions>
    <execution>
      <id>create-TALEND-INF/dependencies.txt</id>
      <phase>process-resources</phase>
      <goals>
        <goal>list</goal>
      </goals>
      <configuration>
        <outputFile>${project.build.outputDirectory}/TALEND-INF/dependencies.txt</outputFile>
      </configuration>
    </execution>
  </executions>
</plugin>

Once build, check the JAR file and look for the following lines:

$ unzip -p target/mycomponent-1.0.0-SNAPSHOT.jar TALEND-INF/dependencies.txt

The following files have been resolved:
   org.talend.sdk.component:component-api:jar:1.0.0-SNAPSHOT:provided
   org.apache.geronimo.specs:geronimo-annotation_1.3_spec:jar:1.0:provided
   org.superbiz:awesome-project:jar:1.2.3:compile
   junit:junit:jar:4.12:test
   org.hamcrest:hamcrest-core:jar:1.3:test

What is important to see is the scope related to the artifacts:

The APIs (component-api and geronimo-annotation_1.3_spec) are provided because you can consider them to be there when executing (they come with the framework).
Your specific dependencies (awesome-project in the example above) are marked as compile: they are included as needed dependencies by the framework (note that using runtime works too).
the other dependencies are ignored. For example, test dependencies.

Packaging an application

Even if a flat classpath deployment is possible, it is not recommended because it would then reduce the capabilities of the components.

Dependencies

The way the framework resolves dependencies is based on a local Maven repository layout. As a quick reminder, it looks like:

.
├── groupId1
│   └── artifactId1
│       ├── version1
│       │   └── artifactId1-version1.jar
│       └── version2
│           └── artifactId1-version2.jar
└── groupId2
    └── artifactId2
        └── version1
            └── artifactId2-version1.jar

This is all the layout the framework uses. The logic converts t-uple {groupId, artifactId, version, type (jar)} to the path in the repository.

Talend Component Kit runtime has two ways to find an artifact:

From the file system based on a configured Maven 2 repository.
From a fat JAR (uber JAR) with a nested Maven repository under MAVEN-INF/repository.

The first option uses either ${user.home}/.m2/repository default) or a specific path configured when creating a ComponentManager. The nested repository option needs some configuration during the packaging to ensure the repository is correctly created.

Creating a nested Maven repository with maven-shade-plugin

To create the nested MAVEN-INF/repository repository, you can use the nested-maven-repository extension:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <transformer implementation="org.talend.sdk.component.container.maven.shade.ContainerDependenciesTransformer">
            <session>${session}</project>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.talend.sdk.component</groupId>
      <artifactId>nested-maven-repository</artifactId>
      <version>${the.plugin.version}</version>
    </dependency>
  </dependencies>
</plugin>

Listing needed plugins

Plugins are usually programmatically registered. If you want to make some of them automatically available, you need to generate a TALEND-INF/plugins.properties file that maps a plugin name to coordinates found with the Maven mechanism described above.

You can enrich maven-shade-plugin to do it:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <transformer implementation="org.talend.sdk.component.container.maven.shade.PluginTransformer">
            <session>${session}</project>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.talend.sdk.component</groupId>
      <artifactId>nested-maven-repository</artifactId>
      <version>${the.plugin.version}</version>
    </dependency>
  </dependencies>
</plugin>

`maven-shade-plugin` extensions

Here is a final job/application bundle based on maven-shade-plugin:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <configuration>
    <createDependencyReducedPom>false</createDependencyReducedPom>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>META-INF/.SF</exclude>
          <exclude>META-INF/.DSA</exclude>
          <exclude>META-INF/*.RSA</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedClassifierName>shaded</shadedClassifierName>
        <transformers>
          <transformer
              implementation="org.talend.sdk.component.container.maven.shade.ContainerDependenciesTransformer">
            <session>${session}</session>
            <userArtifacts>
              <artifact>
                <groupId>org.talend.sdk.component</groupId>
                <artifactId>sample-component</artifactId>
                <version>1.0</version>
                <type>jar</type>
              </artifact>
            </userArtifacts>
          </transformer>
          <transformer implementation="org.talend.sdk.component.container.maven.shade.PluginTransformer">
            <session>${session}</session>
            <userArtifacts>
              <artifact>
                <groupId>org.talend.sdk.component</groupId>
                <artifactId>sample-component</artifactId>
                <version>1.0</version>
                <type>jar</type>
              </artifact>
            </userArtifacts>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.talend.sdk.component</groupId>
      <artifactId>nested-maven-repository-maven-plugin</artifactId>
      <version>${the.version}</version>
    </dependency>
  </dependencies>
</plugin>

The configuration unrelated to transformers depends on your application.

ContainerDependenciesTransformer embeds a Maven repository and PluginTransformer to create a file that lists (one per line) artifacts (representing plugins).

Both transformers share most of their configuration:

session: must be set to ${session}. This is used to retrieve dependencies.
scope: a comma-separated list of scopes to include in the artifact filtering (note that the default will rely on provided but you can replace it by compile, runtime, runtime+compile, runtime+system or test).
include: a comma-separated list of artifacts to include in the artifact filtering.
exclude: a comma-separated list of artifacts to exclude in the artifact filtering.
userArtifacts: a list of artifacts (groupId, artifactId, version, type - optional, file - optional for plugin transformer, scope - optional) which can be forced inline. This parameter is mainly useful for PluginTransformer.
includeTransitiveDependencies: should transitive dependencies of the components be included. Set to true by default.
includeProjectComponentDependencies: should project component dependencies be included. Set to false by default. It is not needed when a job project uses isolation for components.
userArtifacts: set of component artifacts to include.

With the component tooling, it is recommended to keep default locations.
Also if you need to use project dependencies, you can need to refactor your project structure to ensure component isolation.
Talend Component Kit lets you handle that part but the recommended practice is to use userArtifacts for the components instead of project <dependencies>.

ContainerDependenciesTransformer

ContainerDependenciesTransformer specific configuration is as follows:

repositoryBase: base repository location (MAVEN-INF/repository by default).
ignoredPaths: a comma-separated list of folders not to create in the output JAR. This is common for folders already created by other transformers/build parts.

PluginTransformer

ContainerDependenciesTransformer specific configuration is the following one:

pluginListResource: base repository location (default to TALEND-INF/plugins.properties`).

For example, if you want to list only the plugins you use, you can configure this transformer as follows:

<transformer implementation="org.talend.sdk.component.container.maven.shade.PluginTransformer">
  <session>${session}</session>
  <include>org.talend.sdk.component:component-x,org.talend.sdk.component:component-y,org.talend.sdk.component:component-z</include>
</transformer>

Component scanning rules and default exclusions

The framework uses two kind of filterings when scanning your component. One based on the JAR name and one based on the package name. Make sure that your component definitions (including services) are in a scanned module if they are not registered manually using ComponentManager.instance().addPlugin(), and that the component package is not excluded.

Jars Scanning

To find components the framework can scan the classpath but in this case, to avoid to scan the whole classpath which can be really huge an impacts a lot the startup time, several jars are excluded out of the box.

These jars use the following prefix:

ApacheJMeter
FastInfoset
HdrHistogram
HikariCP
PDFBox
RoaringBitmap-
XmlSchema-
accessors-smart
activation-
activeio-
activemq-
aeron
aether-
agrona
akka-
animal-sniffer-annotation
annotation
ant-
antlr-
aopalliance-
apache-el
apacheds-
api-asn1-
api-util-
apiguardian-api-
app-
archaius-core
args4j-
arquillian-
asciidoctorj-
asm-
aspectj
async-http-client-
avalon-framework-
avro-
awaitility-
axis-
axis2-
base64-
batchee-jbatch
batik-
bcpkix
bcprov-
beam-model-
beam-runners-
beam-sdks-
bonecp
bootstrap.jar
brave-
bsf-
build-link
bval
byte-buddy
c3p0-
cache
carrier
cassandra-driver-core
catalina-
catalina.jar
cats
cdi-
cglib-
charsets.jar
chill
classindex
classmate
classutil
classycle
cldrdata
commands-
common-
commons-
component-api
component-form
component-runtime
component-server
component-spi
component-studio
components-api
components-common
compress-lzf
config
constructr
container-core
contenttype
coverage-agent
cryptacular-
cssparser-
curator-
cxf-
daikon
databinding
dataquality
debugger-agent
deltaspike-
deploy.jar
derby-
derbyclient-
derbynet-
dnsns
dom4j
draw2d
ecj-
eclipselink-
ehcache-
el-api
enunciate-core-annotations
error_prone_annotations
expressions
fastutil
feign-core
feign-hystrix
feign-slf4j
filters-helpers
findbugs-
fluentlenium-core
freemarker-
fusemq-leveldb-
gef-
geocoder
geronimo-
gmbal
google-
gpars-
gragent.jar
graph
grizzled-scala
grizzly-
groovy-
grpc-
gson-
guava-
guice-
h2-
hadoop-
hamcrest-
hawtbuf-
hawtdispatch-
hawtio-
hawtjni-runtime
help-
hibernate-
hk2-
howl-
hsqldb-
htmlunit-
htrace-
httpclient-
httpcore-
httpmime
hystrix
iban4j-
icu4j-
idb-
idea_rt.jar
instrumentation-api
istack-commons-runtime-
ivy-
j2objc-annotations
jBCrypt
jaccess
jackson-
janino-
jansi-
jasper-el.jar
jasper.jar
jasypt-
java-atk-wrapper
java-support-
java-xmlbuilder-
javacsv
javaee-
javaee-api
javassist-
javaws.jar
javax.
jaxb-
jaxp-
jbake-
jboss-
jbossall-
jbosscx-
jbossjts-
jbosssx-
jcache
jce.jar
jcip-annotations
jcl-over-slf4j-
jcommander-
jdbcdslog-1
jersey-
jets3t
jettison-
jetty-
jface
jfairy
jffi
jfr.jar
jfxrt.jar
jfxswt
jjwt
jline
jmdns-
jmustache
jna-
jnr-
jobs-
joda-convert
joda-time-
johnzon-
jolokia-
jopt-simple
jruby-
json-
json4s-
jsonb-api
jsoup-
jsp-api
jsr
jsse.jar
jta
jul-to-slf4j-
juli-
junit-
junit5-
jwt
jython
kafka
kahadb-
kotlin-runtime
kryo
leveldb
libphonenumber
lift-json
lmdbjava
localedata
log4j-
logback
logging-event-layout
logkit-
lombok
lucene
lz4
machinist
macro-compat
mail-
management-
mapstruct-
maven-
mbean-annotation-api-
meecrowave-
mesos-
metrics-
microprofile-config-api-
mimepull-
mina-
minlog
mockito-core
mqtt-client-
multiverse-core-
mx4j-
myfaces-
mysql-connector-java-
nashorn
neethi-
neko-htmlunit
nekohtml-
netflix
netty-
nimbus-jose-jwt
objenesis-
okhttp
okio
openjpa-
openmdx-
opensaml-
opentest4j-
openwebbeans-
openws-
ops4j-
org.apache.aries
org.apache.commons
org.apache.log4j
org.eclipse.
org.junit.
org.osgi.core-
org.osgi.enterprise
org.talend
orient-commons-
orientdb-core-
orientdb-nativeos-
oro-
osgi
paranamer
pax-url
play
plexus-
plugin.jar
poi-
postgresql
preferences-
prefixmapper
protobuf-
py4j-
pyrolite-
qdox-
quartz-2
quartz-openejb-
reactive-streams
reflectasm-
registry-
resources.jar
rhino
ribbon
rmock-
routes-compiler
routines
rt.jar
runners
runtime-
rxjava
rxnetty
saaj-
sac-
scala
scalap
scalatest
scannotation-
selenium
serializer-
serp-
service-common
servlet-api-
servo-
shaded
shrinkwrap-
sisu-guice
sisu-inject
slf4j-
slick
smack-
smackx-
snakeyaml-
snappy-
spark-
specs2
spring-
sshd-
ssl-config-core
stax-api-
stax2-api-
stream
sunec.jar
sunjce_provider
sunpkcs11
surefire-
swagger-
swizzle-
sxc-
system-rules
tachyon-
talend-icon
test-interface
testng-
tomcat
tomee-
tools.jar
twirl
twitter4j-
tyrex
uncommons
unused
validation-api-
velocity-
wagon-
webbeans-
websocket
woodstox-core
workbench
ws-commons-util-
wsdl4j-
wss4j-
wstx-asl-
xalan-
xbean-
xercesImpl-
xml-apis-
xml-resolver-
xmlbeans-
xmlenc-
xmlgraphics-
xmlpull-
xmlrpc-
xmlschema-
xmlsec-
xmltooling-
xmlunit-
xstream-
xz-
zipfs.jar
zipkin-
ziplock-
zkclient
zookeeper-

Package Scanning

Since the framework can be used in the case of fatjars or shades, and because it still uses scanning, it is important to ensure we don’t scan the whole classes for performances reason.

Therefore, the following packages are ignored:

avro.shaded
com.codehale.metrics
com.ctc.wstx
com.datastax.driver.core
com.fasterxml.jackson.annotation
com.fasterxml.jackson.core
com.fasterxml.jackson.databind
com.fasterxml.jackson.dataformat
com.fasterxml.jackson.module
com.google.common
com.google.thirdparty
com.ibm.wsdl
com.jcraft.jsch
com.kenai.jffi
com.kenai.jnr
com.sun.istack
com.sun.xml.bind
com.sun.xml.messaging.saaj
com.sun.xml.txw2
com.thoughtworks
io.jsonwebtoken
io.netty
io.swagger.annotations
io.swagger.config
io.swagger.converter
io.swagger.core
io.swagger.jackson
io.swagger.jaxrs
io.swagger.model
io.swagger.models
io.swagger.util
javax
jnr
junit
net.sf.ehcache
net.shibboleth.utilities.java.support
org.aeonbits.owner
org.apache.activemq
org.apache.beam
org.apache.bval
org.apache.camel
org.apache.catalina
org.apache.commons.beanutils
org.apache.commons.cli
org.apache.commons.codec
org.apache.commons.collections
org.apache.commons.compress
org.apache.commons.dbcp2
org.apache.commons.digester
org.apache.commons.io
org.apache.commons.jcs.access
org.apache.commons.jcs.admin
org.apache.commons.jcs.auxiliary
org.apache.commons.jcs.engine
org.apache.commons.jcs.io
org.apache.commons.jcs.utils
org.apache.commons.lang
org.apache.commons.lang3
org.apache.commons.logging
org.apache.commons.pool2
org.apache.coyote
org.apache.cxf
org.apache.geronimo.javamail
org.apache.geronimo.mail
org.apache.geronimo.osgi
org.apache.geronimo.specs
org.apache.http
org.apache.jcp
org.apache.johnzon
org.apache.juli
org.apache.logging.log4j.core
org.apache.logging.log4j.jul
org.apache.logging.log4j.util
org.apache.logging.slf4j
org.apache.meecrowave
org.apache.myfaces
org.apache.naming
org.apache.neethi
org.apache.openejb
org.apache.openjpa
org.apache.oro
org.apache.tomcat
org.apache.tomee
org.apache.velocity
org.apache.webbeans
org.apache.ws
org.apache.wss4j
org.apache.xbean
org.apache.xml
org.apache.xml.resolver
org.bouncycastle
org.codehaus.jackson
org.codehaus.stax2
org.codehaus.swizzle.Grep
org.codehaus.swizzle.Lexer
org.cryptacular
org.eclipse.jdt.core
org.eclipse.jdt.internal
org.fusesource.hawtbuf
org.h2
org.hamcrest
org.hsqldb
org.jasypt
org.jboss.marshalling
org.joda.time
org.jose4j
org.junit
org.jvnet.mimepull
org.metatype.sxc
org.objectweb.asm
org.objectweb.howl
org.openejb
org.opensaml
org.slf4j
org.swizzle
org.terracotta.context
org.terracotta.entity
org.terracotta.modules.ehcache
org.terracotta.statistics
org.tukaani
org.yaml.snakeyaml
serp

it is not recommanded but possible to add in your plugin module a TALEND-INF/scanning.properties file with classloader.includes and classloader.excludes entries to refine the scanning with custom rules. In such a case, exclusions win over inclusions.

Build tools

Maven Plugin

talend-component-maven-plugin helps you write components that match best practices and generate transparently metadata used by Talend Studio.

You can use it as follows:

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${component.version}</version>
</plugin>

This plugin is also an extension so you can declare it in your build/extensions block as:

<extension>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${component.version}</version>
</extension>

Used as an extension, the dependencies, validate and documentation goals will be set up.

Dependencies

The first goal is a shortcut for the maven-dependency-plugin. It creates the TALEND-INF/dependencies.txt file with the compile and runtime dependencies, allowing the component to use it at runtime:

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${component.version}</version>
  <executions>
    <execution>
      <id>talend-dependencies</id>
      <goals>
        <goal>dependencies</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Validate

This goal helps you validate the common programming model of the component. To activate it, you can use following execution definition:

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${component.version}</version>
  <executions>
    <execution>
      <id>talend-component-validate</id>
      <goals>
        <goal>validate</goal>
      </goals>
    </execution>
  </executions>
</plugin>

It is bound to the process-classes phase by default. When executed, it performs several validations that can be disabled by setting the corresponding flags to false in the <configuration> block of the execution:

Name Description Default

Name	Description	Default
validateInternationalization	Validates that resource bundles are presents and contain commonly used keys (for example, `_displayName`)	true
validateModel	Ensures that components pass validations of the `ComponentManager` and Talend Component runtime	true
validateSerializable	Ensures that components are `Serializable`. This is a sanity check, the component is not actually serialized here. If you have a doubt, make sure to test it. It also checks that any `@Internationalized` class is valid and has its keys.	true
validateMetadata	Ensures that components have an `@Icon` and a `@Version` defined.	true
validateDataStore	Ensures that any `@DataStore` defines a `@HealthCheck`.	true
validateComponent	Ensures that the native programming model is respected. You can disable it when using another programming model like Beam.	true
validateActions	Validates action signatures for actions not tolerating dynamic binding (`@HealthCheck`, `@DynamicValues`, and so on). It is recommended to keep it set to `true`.	true
validateFamily	Validates the family by verifying that the package containing the `@Components` has a `@Icon` property defined.	true
validateDocumentation	Ensures that all components and `@Option` properties have a documentation using the `@Documentation` property.	true
validateLayout	Ensures that the layout is referencing existing options and properties.	true
validateOptionNames	Ensures that the option names are compliant with the framework. It is highly recommended and safer to keep it set to `true`.	true

validateInternationalization

Validates that resource bundles are presents and contain commonly used keys (for example, _displayName)

true

validateModel

Ensures that components pass validations of the ComponentManager and Talend Component runtime

true

validateSerializable

Ensures that components are Serializable. This is a sanity check, the component is not actually serialized here. If you have a doubt, make sure to test it. It also checks that any @Internationalized class is valid and has its keys.

true

validateMetadata

Ensures that components have an @Icon and a @Version defined.

true

validateDataStore

Ensures that any @DataStore defines a @HealthCheck.

true

validateComponent

Ensures that the native programming model is respected. You can disable it when using another programming model like Beam.

true

validateActions

Validates action signatures for actions not tolerating dynamic binding (@HealthCheck, @DynamicValues, and so on). It is recommended to keep it set to true.

true

validateFamily

Validates the family by verifying that the package containing the @Components has a @Icon property defined.

true

validateDocumentation

Ensures that all components and @Option properties have a documentation using the @Documentation property.

true

validateLayout

Ensures that the layout is referencing existing options and properties.

true

validateOptionNames

Ensures that the option names are compliant with the framework. It is highly recommended and safer to keep it set to true.

true

Documentation

This goal generates an Asciidoc file documenting your component from the configuration model (@Option) and the @Documentation property that you can add to options and to the component itself.

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${component.version}</version>
  <executions>
    <execution>
      <id>talend-component-documentation</id>
      <goals>
        <goal>asciidoc</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Name Description Default

Name	Description	Default
level	Level of the root title.	2 (`==`)
output	Output folder path. It is recommended to keep it to the default value.	`${classes}/TALEND-INF/documentation.adoc`
formats	Map of the renderings to do. Keys are the format (`pdf` or `html`) and values the output paths.	-
attributes	Map of asciidoctor attributes when formats is set.	-
templateDir / templateEngine	Template configuration for the rendering.	-
title	Document title.	${project.name}
attachDocumentations	Allows to attach (and deploy) the documentations (`.adoc`, and `formats` keys) to the project.	true

level

Level of the root title.

2 (==)

output

Output folder path. It is recommended to keep it to the default value.

${classes}/TALEND-INF/documentation.adoc

formats

Map of the renderings to do. Keys are the format (pdf or html) and values the output paths.

attributes

Map of asciidoctor attributes when formats is set.

templateDir / templateEngine

Template configuration for the rendering.

title

Document title.

${project.name}

attachDocumentations

Allows to attach (and deploy) the documentations (.adoc, and formats keys) to the project.

true

If you use the plugin as an extension, you can add the talend.documentation.htmlAndPdf property and set it to true in your project to automatically get HTML and PDF renderings of the documentation.

Rendering your documentation

To render the generated documentation in HTML or PDF, you can use the Asciidoctor Maven plugin (or Gradle equivalent). You can configure both executions if you want both HTML and PDF renderings.

Make sure to execute the rendering after the documentation generation.

HTML rendering

If you prefer a HTML rendering, you can configure the following execution in the asciidoctor plugin. The example below:

Generates the components documentation in target/classes/TALEND-INF/documentation.adoc.
Renders the documentation as an HTML file stored in target/documentation/documentation.html.

<plugin> (1)
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${talend-component-kit.version}</version>
  <executions>
    <execution>
      <id>documentation</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>asciidoc</goal>
      </goals>
    </execution>
  </executions>
</plugin>
<plugin> (2)
  <groupId>org.asciidoctor</groupId>
  <artifactId>asciidoctor-maven-plugin</artifactId>
  <version>1.5.6</version>
  <executions>
    <execution>
      <id>doc-html</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>process-asciidoc</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.build.outputDirectory}/TALEND-INF</sourceDirectory>
        <sourceDocumentName>documentation.adoc</sourceDocumentName>
        <outputDirectory>${project.build.directory}/documentation</outputDirectory>
        <backend>html5</backend>
      </configuration>
    </execution>
  </executions>
</plugin>

PDF rendering

If you prefer a PDF rendering, you can configure the following execution in the asciidoctor plugin:

<plugin>
  <groupId>org.asciidoctor</groupId>
  <artifactId>asciidoctor-maven-plugin</artifactId>
  <version>1.5.6</version>
  <executions>
    <execution>
      <id>doc-html</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>process-asciidoc</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.build.outputDirectory}/TALEND-INF</sourceDirectory>
        <sourceDocumentName>documentation.adoc</sourceDocumentName>
        <outputDirectory>${project.build.directory}/documentation</outputDirectory>
        <backend>pdf</backend>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.asciidoctor</groupId>
      <artifactId>asciidoctorj-pdf</artifactId>
      <version>1.5.0-alpha.16</version>
    </dependency>
  </dependencies>
</plugin>

Including the documentation into a document

If you want to add some more content or a title, you can include the generated document into another document using Asciidoc include directive.

For example:

= Super Components
Super Writer
:toc:
:toclevels: 3
:source-highlighter: prettify
:numbered:
:icons: font
:hide-uri-scheme:
:imagesdir: images

include::{generated_doc}/documentation.adoc[]

To be able to do that, you need to pass the generated_doc attribute to the plugin. For example:

<plugin>
  <groupId>org.asciidoctor</groupId>
  <artifactId>asciidoctor-maven-plugin</artifactId>
  <version>1.5.6</version>
  <executions>
    <execution>
      <id>doc-html</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>process-asciidoc</goal>
      </goals>
      <configuration>
        <sourceDirectory>${project.basedir}/src/main/asciidoc</sourceDirectory>
        <sourceDocumentName>my-main-doc.adoc</sourceDocumentName>
        <outputDirectory>${project.build.directory}/documentation</outputDirectory>
        <backend>html5</backend>
        <attributes>
          <generated_adoc>${project.build.outputDirectory}/TALEND-INF</generated_adoc>
        </attributes>
      </configuration>
    </execution>
  </executions>
</plugin>

This is optional but allows to reuse Maven placeholders to pass paths, which can be convenient in an automated build.

You can find more customization options on Asciidoctor website.

Testing a component web rendering

Testing the rendering of your component configuration into the Studio requires deploying the component in Talend Studio (refer to Studio Documentation.

In the case where you need to deploy your component into a Cloud (web) environment, you can test its web rendering by using the web goal of the plugin:

Run the mvn talend-component:web command.
Open the following URL in a web browser: localhost:8080.
Select the component form you want to see from the treeview on the left. The selected form is displayed on the right.

Two parameters are available with the plugin:

serverPort, which allows to change the default port (8080) of the embedded server.
serverArguments, that you can use to pass Meecrowave options to the server. Learn more about that configuration at openwebbeans.apache.org/meecrowave/meecrowave-core/cli.html.

Make sure to install the artifact before using this command because it reads the component JAR from the local Maven repository.

Generating inputs or outputs

The Mojo generate (Maven plugin goal) of the same plugin also embeds a generator that you can use to bootstrap any input or output component:

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${talend-component.version}</version>
  <executions>
    <execution> (1)
      <id>generate-input</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <type>input</type>
      </configuration>
    </execution>
    <execution> (2)
      <id>generate-output</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <type>output</type>
      </configuration>
    </execution>
  </executions>
</plugin>

1	The first execution generates an input (partition mapper + emitter).
2	the second execution generates an output.

It is intended to be used from the command line (or IDE Maven integration) as follows:

$ mvn talend-component:generate \
    -Dtalend.generator.type=[input|output] \ (1)
    [-Dtalend.generator.classbase=com.test.MyComponent] \ (2)
    [-Dtalend.generator.family=my-family] \ (3)
    [-Dtalend.generator.pom.read-only=false] (4)

1	Select the type of component you want: `input` to generate a mapper and an emitter, or `output` to generate an output processor.
2	Set the class name base (automatically suffixed by the component type). If not set, the package is guessed and the classname is based on the basedir name.
3	Set the component family to use. If not specified, it defaults to the basedir name and removes "component[s]" from it. for example, `my-component` leads to `my` as family, unless it is explicitly set.
4	Specify if the generator needs to add `component-api` to the POM, if not already there. If you already added it, you can set it to `false` directly in the POM.

For this command to work, you need to register the plugin as follows:

<plugin>
  <groupId>org.talend.sdk.component</groupId>
  <artifactId>talend-component-maven-plugin</artifactId>
  <version>${talend-component.version}</version>
</plugin>

Talend Component Archive

Component ARchive (.car) is the way to bundle a component to share it in the Talend ecosystem. It is a plain Java ARchive (.jar) containing a metadata file and a nested Maven repository containing the component and its depenencies.

mvn talend-component:car

This command creates a .car file in your build directory. This file can be shared on Talend platforms.

This CAR is executable and exposes the studio-deploy command which takes a Talend Studio home path as parameter. When executed, it installs the dependencies into the Studio and registers the component in your instance. For example:

# for a studio
java -jar mycomponent.car studio-deploy /path/to/my/studio
or
java -jar mycomponent.car studio-deploy --location /path/to/my/studio

# for a m2 provisioning
java -jar mycomponent.car maven-deploy /path/to/.m2/repository
or
java -jar mycomponent.car maven-deploy --location /path/to/.m2/repository

You can also upload the dependencies to your Nexus server using the following command:

java -jar mycomponent.car deploy-to-nexus --url <nexus url> --repo <repository name> --user <username> --pass <password> --threads <parallel threads number> --dir <temp directory>

In this command, Nexus URL and repository name are mandatory arguments. All other arguments are optional. If arguments contain spaces or special symbols, you need to quote the whole value of the argument. For example:

--pass "Y0u will \ not G4iess i' ^"

Gradle Plugin

gradle-talend-component helps you write components that match the best practices. It is inspired from the Maven plugin and adds the ability to generate automatically the dependencies.txt file used by the SDK to build the component classpath. For more information on the configuration, refer to the Maven properties matching the attributes.

You can use it as follows:

buildscript {
  repositories {
    mavenLocal()
    mavenCentral()
  }
  dependencies {
    classpath "org.talend.sdk.component:gradle-talend-component:${talendComponentVersion}"
  }
}

apply plugin: 'org.talend.sdk.component'
apply plugin: 'java'

// optional customization
talendComponentKit {
    // dependencies.txt generation, replaces maven-dependency-plugin
    dependenciesLocation = "TALEND-INF/dependencies.txt"
    boolean skipDependenciesFile = false;

    // classpath for validation utilities
    sdkVersion = "${talendComponentVersion}"
    apiVersion = "${talendComponentApiVersion}"

    // documentation
    skipDocumentation = false
    documentationOutput = new File(....)
    documentationLevel = 2 // first level will be == in the generated adoc
    documentationTitle = 'My Component Family' // default to project name
    documentationFormats = [:] // adoc attributes
    documentationFormats = [:] // renderings to do

    // validation
    skipValidation = false
    validateFamily = true
    validateSerializable = true
    validateInternationalization = true
    validateModel = true
    validateOptionNames = true
    validateMetadata = true
    validateComponent = true
    validateDataStore = true
    validateDataSet = true
    validateActions = true

    // web
    serverArguments = []
    serverPort = 8080

    // car
    carOutput = new File(....)
    carMetadata = [:] // custom meta (string key-value pairs)
}

Services

Internationalizing services

Internationalization requires following several best practices:

Storing messages using ResourceBundle properties file in your component module.
The location of the properties is in the same package than the related components and is named Messages. For example, org.talend.demo.MyComponent uses org.talend.demo.Messages[locale].properties.
Use the internationalization API for your own messages.

Internationalization API

The Internationalization API is the mechanism to use to internationalize your own messages in your own components.

The principle of the API is to design messages as methods returning String values and get back a template using a ResourceBundle named Messages and located in the same package than the interface that defines these methods.

To ensure your internationalization API is identified, you need to mark it with the @Internationalized annotation:

@Internationalized (1)
public interface Translator {

    String message();

    String templatizedMessage(String arg0, int arg1); (2)

    String localized(String arg0, @Language Locale locale); (3)

    String localized(String arg0, @Language String locale); (4)
}

1	`@Internationalized` allows to mark a class as an internationalized service.
2	You can pass parameters. The message uses the `MessageFormat` syntax to be resolved, based on the `ResourceBundle` template.
3	You can use `@Language` on a `Locale` parameter to specify manually the locale to use. Note that a single value is used (the first parameter tagged as such).
4	`@Language` also supports the `String` type.

Providing actions for consumers

In some cases you can need to add some actions that are not related to the runtime. For example, enabling clients - the users of the plugin/library - to test if a connection works properly.

To do so, you need to define an @Action, which is a method with a name (representing the event name), in a class decorated with @Service:

@Service
public class MyDbTester {
    @Action(family = "mycomp", "test")
    public Status doTest(final IncomingData data) {
        return ...;
    }
}

Services are singleton. If you need some thread safety, make sure that they match that requirement. Services should not store any status either because they can be serialized at any time. Status are held by the component.

Services can be used in components as well (matched by type). They allow to reuse some shared logic, like a client. Here is a sample with a service used to access files:

@Emitter(family = "sample", name = "reader")
public class PersonReader implements Serializable {
    // attributes skipped to be concise

    public PersonReader(@Option("file") final File file,
                        final FileService service) {
        this.file = file;
        this.service = service;
    }

    // use the service
    @PostConstruct
    public void open() throws FileNotFoundException {
        reader = service.createInput(file);
    }

}

The service is automatically passed to the constructor. It can be used as a bean. In that case, it is only necessary to call the service method.

Particular action types

Some common actions need a clear contract so they are defined as API first-class citizen. For example, this is the case for wizards or health checks. Here is the list of the available actions:

API	Type	Description	Return type	Sample returned type
@org.talend.sdk.component.api.service.completion.DynamicValues	dynamic_values	Mark a method as being useful to fill potential values of a string option for a property denoted by its value. You can link a field as being completable using @Proposable(value). The resolution of the completion action is then done through the component family and value of the action. The callback doesn’t take any parameter.	Values	`{"items":[{"id":"value","label":"label"}]}`
@org.talend.sdk.component.api.service.healthcheck.HealthCheck	healthcheck	This class marks an action doing a connection test	HealthCheckStatus	`{"comment":"Something went wrong","status":"KO"}`
@org.talend.sdk.component.api.service.schema.DiscoverSchema	schema	Mark an action as returning a discovered schema. Its parameter MUST be the type decorated with `@Structure`.	Schema	`{"entries":[{"name":"column1","type":"STRING"}]}`
@org.talend.sdk.component.api.service.completion.Suggestions	suggestions	Mark a method as being useful to fill potential values of a string option. You can link a field as being completable using @Suggestable(value). The resolution of the completion action is then done when the user requests it (generally by clicking on a button or entering the field depending the environment).	SuggestionValues	`{"cacheable":false,"items":[{"id":"value","label":"label"}]}`
@org.talend.sdk.component.api.service.Action	user	-	any	-
@org.talend.sdk.component.api.service.asyncvalidation.AsyncValidation	validation	Mark a method as being used to validate a configuration. IMPORTANT: this is a server validation so only use it if you can’t use other client side validation to implement it.	ValidationResult	`{"comment":"Something went wrong","status":"KO"}`

Internationalization

Internationalization is supported through the injection of the $lang parameter, which allows you to get the correct locale to use with an @Internationalized service:

public SuggestionValues findSuggestions(@Option("someParameter") final String param,
                                        @Option("$lang") final String lang) {
    return ...;
}

You can combine the $lang option with the @Internationalized and @Language parameters.

Built-in services

The framework provides built-in services that you can inject by type in components and actions.

Here is the list:

Type Description

Type	Description
`org.talend.sdk.component.api.service.cache.LocalCache`	Provides a small abstraction to cache data that does not need to be recomputed very often. Commonly used by actions for UI interactions.
`org.talend.sdk.component.api.service.dependency.Resolver`	Allows to resolve a dependency from its Maven coordinates.
`javax.json.bind.Jsonb`	A JSON-B instance. If your model is static and you don’t want to handle the serialization manually using JSON-P, you can inject that instance.
`javax.json.spi.JsonProvider`	A JSON-P instance. Prefer other JSON-P instances if you don’t exactly know why you use this one.
`javax.json.JsonBuilderFactory`	A JSON-P instance. It is recommended to use this one instead of a custom one to optimize memory usage and speed.
`javax.json.JsonWriterFactory`	A JSON-P instance. It is recommended to use this one instead of a custom one to optimize memory usage and speed.
`javax.json.JsonReaderFactory`	A JSON-P instance. It is recommended to use this one instead of a custom one to optimize memory usage and speed.
`javax.json.stream.JsonParserFactory`	A JSON-P instance. It is recommended to use this one instead of a custom one to optimize memory usage and speed.
`javax.json.stream.JsonGeneratorFactory`	A JSON-P instance. It is recommended to use this one instead of a custom one to optimize memory usage and speed.
`org.talend.sdk.component.api.service.configuration.LocalConfiguration`	Represents the local configuration that can be used during the design.
`org.talend.sdk.component.api.service.dependency.Resolver`	Allows to resolve files from Maven coordinates (like `dependencies.txt` for component). Note that it assumes that the files are available in the component Maven repository.
`org.talend.sdk.component.api.service.injector.Injector`	Utility to inject services in fields marked with `@Service`.
`org.talend.sdk.component.api.service.factory.ObjectFactory`	Allows to instantiate an object from its class name and properties. It is not recommended to use it for the runtime because the local configuration is usually different and the instances are distinct. You can also use the local cache as an interceptor with `@Cached`
Every interface that extends `HttpClient` and that contains methods annotated with `@Request`	Lets you define an HTTP client in a declarative manner using an annotated interface. See the Using HttpClient for more details.

org.talend.sdk.component.api.service.cache.LocalCache

Provides a small abstraction to cache data that does not need to be recomputed very often. Commonly used by actions for UI interactions.

org.talend.sdk.component.api.service.dependency.Resolver

Allows to resolve a dependency from its Maven coordinates.

javax.json.bind.Jsonb

A JSON-B instance. If your model is static and you don’t want to handle the serialization manually using JSON-P, you can inject that instance.

javax.json.spi.JsonProvider

A JSON-P instance. Prefer other JSON-P instances if you don’t exactly know why you use this one.

javax.json.JsonBuilderFactory