# Talend Component Testing Documentation

## Testing best practices

This section mainly concerns tools that can be used with JUnit. You can use most of these best practices with TestNG as well.

### Parameterized tests

Parameterized tests are a great solution to repeat the same test multiple times. This method of testing requires defining a test scenario (`I test function F`) and making the input/output data dynamic.

#### JUnit 4

Here is a test example, which validates a connection URI using `ConnectionService`:

``````public class MyConnectionURITest {
@Test
public void checkMySQL() {
assertTrue(new ConnectionService().isValid("jdbc:mysql://localhost:3306/mysql"));
}

@Test
public void checkOracle() {
assertTrue(new ConnectionService().isValid("jdbc:oracle:thin:@//myhost:1521/oracle"));
}
}``````

The testing method is always the same. Only values are changing. It can therefore be rewritten using JUnit `Parameterized` runner, as follows:

``````@RunWith(Parameterized.class) (1)
public class MyConnectionURITest {

@Parameterized.Parameters(name = "{0}") (2)
public static Iterable<String> uris() { (3)
return asList(
"jdbc:mysql://localhost:3306/mysql",
"jdbc:oracle:thin:@//myhost:1521/oracle");
}

@Parameterized.Parameter (4)
public String uri;

@Test
public void isValid() { (5)
assertNotNull(uri);
}
}``````
 1 `Parameterized` is the runner that understands `@Parameters` and how to use it. If needed, you can generate random data here. 2 By default the name of the executed test is the index of the data. Here, it is customized using the first `toString()` parameter value to have something more readable. 3 The `@Parameters` method must be static and return an array or iterable of the data used by the tests. 4 You can then inject the current data using the `@Parameter` annotation. It can take a parameter if you use an array of array instead of an iterable of object in `@Parameterized`. You can select which item you want to inject. 5 The `@Test` method is executed using the contextual data. In this sample, it gets executed twice with the two specified URIs.
 You don’t have to define a single `@Test` method. If you define multiple methods, each of them is executed with all the data. For example, if another test is added to the previous example, four tests are executed - 2 per data).

#### JUnit 5

With JUnit 5, parameterized tests are easier to use. The full documentation is available at junit.org/junit5/docs/current/user-guide/#writing-tests-parameterized-tests.

The main difference with JUnit 4 is that you can also define inline that the test method is a parameterized test as well as the values to use:

``````@ParameterizedTest
@ValueSource(strings = { "racecar", "radar", "able was I ere I saw elba" })
void mytest(String currentValue) {
// do test
}``````

However, you can still use the previous behavior with a method binding configuration:

``````@ParameterizedTest
@MethodSource("stringProvider")
void mytest(String currentValue) {
// do test
}

static Stream<String> stringProvider() {
return Stream.of("foo", "bar");
}``````

This last option allows you to inject any type of value - not only primitives - which is common to define scenarios.

 Add the `junit-jupiter-params` dependency to benefit from this feature.

## component-runtime-testing

### component-runtime-junit

`component-runtime-junit` is a test library that allows you to validate simple logic based on the Talend Component Kit tooling.

``````<dependency>
<groupId>org.talend.sdk.component</groupId>
<artifactId>component-runtime-junit</artifactId>
<version>${talend-component.version}</version> <scope>test</scope> </dependency>`````` This dependency also provides mocked components that you can use with your own component to create tests. The mocked components are provided under the `test` family: • `emitter` : a mock of an input component • `collector` : a mock of an output component #### JUnit 4 You can define a standard JUnit test and use the `SimpleComponentRule` rule: ``````public class MyComponentTest { @Rule (1) public final SimpleComponentRule components = new SimpleComponentRule("org.talend.sdk.component.mycomponent"); @Test public void produce() { Job.components() (2) .component("mycomponent","yourcomponentfamily://yourcomponent?"+createComponentConfig()) .component("collector", "test://collector") .connections() .from("mycomponent").to("collector") .build() .run(); final List<MyRecord> records = components.getCollectedData(MyRecord.class); (3) doAssertRecords(records); // depending your test } }``````  1 The rule creates a component manager and provides two mock components: an emitter and a collector. Set the root package of your component to enable it. 2 Define any chain that you want to test. It generally uses the mock as source or collector. 3 Validate your component behavior. For a source, you can assert that the right records were emitted in the mock collect.  The rule can also be defined as a `@ClassRule` to start it once per class and not per test as with `@Rule`. To go further, you can add the `ServiceInjectionRule` rule, which allows to inject all the component family services into the test class by marking test class fields with `@InjectService`: ``````public class SimpleComponentRuleTest { @ClassRule public static final SimpleComponentRule COMPONENT_FACTORY = new SimpleComponentRule("..."); @Rule (1) public final ServiceInjectionRule injections = new ServiceInjectionRule(COMPONENT_FACTORY, this); (2) @Service (3) private LocalConfiguration configuration; @Service private Jsonb jsonb; @Test public void test() { // ... } }``````  1 The injection requires the test instance, so it must be a `@Rule` rather than a `@ClassRule`. 2 The `ComponentsController` is passed to the rule, which for JUnit 4 is the `SimpleComponentRule`, as well as the test instance to inject services in. 3 All service fields are marked with `@Service` to let the rule inject them before the test is ran. #### JUnit 5 The JUnit 5 integration is very similar to JUnit 4, except that it uses the JUnit 5 extension mechanism. The entry point is the `@WithComponents` annotation that you add to your test class, and which takes the component package you want to test. You can use `@Injected` to inject an instance of `ComponentsHandler` - which exposes the same utilities than the JUnit 4 rule - in a test class field : ``````@WithComponents("org.talend.sdk.component.junit.component") (1) public class ComponentExtensionTest { @Injected (2) private ComponentsHandler handler; @Test public void manualMapper() { final Mapper mapper = handler.createMapper(Source.class, new Source.Config() { { values = asList("a", "b"); } }); assertFalse(mapper.isStream()); final Input input = mapper.create(); assertEquals("a", input.next()); assertEquals("b", input.next()); assertNull(input.next()); } }``````  1 The annotation defines which components to register in the test context. 2 The field allows to get the handler to be able to orchestrate the tests.  If you use JUnit 5 for the first time, keep in mind that the imports changed and that you need to use `org.junit.jupiter.api.Test` instead of `org.junit.Test`. Some IDE versions and `surefire` versions can also require you to install either a plugin or a specific configuration. As for JUnit 4, you can go further by injecting test class fields marked with `@InjectService`, but there is no additional extension to specify in this case: ``````@WithComponents("...") class ComponentExtensionTest { @Service (1) private LocalConfiguration configuration; @Service private Jsonb jsonb; @Test void test() { // ... } }``````  1 All service fields are marked with `@Service` to let the rule inject them before the test is ran. #### Mocking the output Using the "test"/"collector" component as shown in the previous sample stores all records emitted by the chain (typically your source) in memory. You can then access them using `theSimpleComponentRule.getCollectedData(type)`. Note that this method filters by type. If you don’t need any specific type, you can use `Object.class`. #### Mocking the input The input mocking is symmetric to the output. In this case, you provide the data you want to inject: ``````public class MyComponentTest { @Rule public final SimpleComponentRule components = new SimpleComponentRule("org.talend.sdk.component.mycomponent"); @Test public void produce() { components.setInputData(asList(createData(), createData(), createData())); (1) Job.components() .component("emitter","test://emitter") .component("out", "yourcomponentfamily://myoutput?"+createComponentConfig()) .connections() .from("emitter").to("out") .build .run(); assertMyOutputProcessedTheInputData(); } }``````  1 using `setInputData`, you prepare the execution(s) to have a fake input when using the "test"/"emitter" component. #### Creating runtime configuration from component configuration The component configuration is a POJO (using `@Option` on fields) and the runtime configuration (`ExecutionChainBuilder`) uses a `Map<String, String>`. To make the conversion easier, the JUnit integration provides a `SimpleFactory.configurationByExample` utility to get this map instance from a configuration instance. Example: ``````final MyComponentConfig componentConfig = new MyComponentConfig(); componentConfig.setUser("...."); // .. other inits final Map<String, String> configuration = configurationByExample(componentConfig);`````` The same factory provides a fluent DSL to create the configuration by calling `configurationByExample` without any parameter. The advantage is to be able to convert an object as a `Map<String, String>` or as a query string in order to use it with the `Job` DSL: ``````final String uri = "family://component?" + configurationByExample().forInstance(componentConfig).configured().toQueryString();`````` It handles the encoding of the URI to ensure it is correctly done. #### Testing a Mapper The `SimpleComponentRule` also allows to test a mapper unitarily. You can get an instance from a configuration and execute this instance to collect the output. Example ``````public class MapperTest { @ClassRule public static final SimpleComponentRule COMPONENT_FACTORY = new SimpleComponentRule( "org.company.talend.component"); @Test public void mapper() { final Mapper mapper = COMPONENT_FACTORY.createMapper(MyMapper.class, new Source.Config() {{ values = asList("a", "b"); }}); assertEquals(asList("a", "b"), COMPONENT_FACTORY.collectAsList(String.class, mapper)); } }`````` #### Testing a Processor As for a mapper, a processor is testable unitary. However, this case can be more complex in case of multiple inputs or outputs. Example ``````public class ProcessorTest { @ClassRule public static final SimpleComponentRule COMPONENT_FACTORY = new SimpleComponentRule( "org.company.talend.component"); @Test public void processor() { final Processor processor = COMPONENT_FACTORY.createProcessor(Transform.class, null); final SimpleComponentRule.Outputs outputs = COMPONENT_FACTORY.collect(processor, new JoinInputFactory().withInput("__default__", asList(new Transform.Record("a"), new Transform.Record("bb"))) .withInput("second", asList(new Transform.Record("1"), new Transform.Record("2"))) ); assertEquals(2, outputs.size()); assertEquals(asList(2, 3), outputs.get(Integer.class, "size")); assertEquals(asList("a1", "bb2"), outputs.get(String.class, "value")); } }`````` The rule allows you to instantiate a `Processor` from your code, and then to `collect` the output from the inputs you pass in. There are two convenient implementations of the input factory: 1. `MainInputFactory` for processors using only the default input. 2. `JoinInputfactory` with the `withInput(branch, data)` method for processors using multiple inputs. The first argument is the branch name and the second argument is the data used by the branch.  If needed, you can also implement your own input representation using `org.talend.sdk.component.junit.ControllableInputFactory`. ### component-runtime-testing-spark The following artifact allows you to test against a Spark cluster: ``````<dependency> <groupId>org.talend.sdk.component</groupId> <artifactId>component-runtime-testing-spark</artifactId> <version>${talend-component.version}</version>
<scope>test</scope>
</dependency>``````

#### JUnit 4

The testing relies on a JUnit `TestRule`. It is recommended to use it as a `@ClassRule`, to make sure that a single instance of a Spark cluster is built. You can also use it as a simple `@Rule`, to create the Spark cluster instances per method instead of per test class.

The `@ClassRule` takes the Spark and Scala versions to use as parameters. It then forks a master and N slaves. Finally, the `submit*` method allows you to send jobs either from the test classpath or from a shade if you run it as an integration test.

For example:

``````public class SparkClusterRuleTest {

@ClassRule
public static final SparkClusterRule SPARK = new SparkClusterRule("2.10", "1.6.3", 1);

@Test
public void classpathSubmit() throws IOException {
SPARK.submitClasspath(SubmittableMain.class, getMainArgs());

// wait for the test to pass
}
}``````
 This testing methodology works with `@Parameterized`. You can submit several jobs with different arguments and even combine it with Beam `TestPipeline` if you make it `transient`.

#### JUnit 5

The integration of that Spark cluster logic with JUnit 5 is done using the `@WithSpark` marker for the extension. Optionally, it allows you to inject—through `@SparkInject`—the `BaseSpark<?>` handler to access the Spark cluster meta information. For example, its host/port.

Example
``````@WithSpark
class SparkExtensionTest {

@SparkInject
private BaseSpark<?> spark;

@Test
void classpathSubmit() throws IOException {
final File out = new File(jarLocation(SparkClusterRuleTest.class).getParentFile(), "classpathSubmitJunit5.out");
if (out.exists()) {
out.delete();
}
spark.submitClasspath(SparkClusterRuleTest.SubmittableMain.class, spark.getSparkMaster(), out.getAbsolutePath());

await().atMost(5, MINUTES).until(
() -> out.exists() ? Files.readAllLines(out.toPath()).stream().collect(joining("\n")).trim() : null,
equalTo("b -> 1\na -> 1"));
}
}``````

#### Checking the job execution status

Currently, `SparkClusterRule` does not allow to know when a job execution is done, even by exposing and polling the web UI URL to check. The best solution at the moment is to make sure that the output of your job exists and contains the right value.

`awaitability` or any equivalent library can help you to implement such logic:

``````<dependency>
<groupId>org.awaitility</groupId>
<artifactId>awaitility</artifactId>
<version>3.0.0</version>
<scope>test</scope>
</dependency>``````

To wait until a file exists and check that its content (for example) is the expected one, you can use the following logic:

``````await()
.atMost(5, MINUTES)
.until(
() -> out.exists() ? Files.readAllLines(out.toPath()).stream().collect(joining("\n")).trim() : null,
equalTo("the expected content of the file"));``````

### component-runtime-http-junit

The HTTP JUnit module allows you to mock REST API very simply. The module coordinates are:

``````<dependency>
<groupId>org.talend.sdk.component</groupId>
<artifactId>component-runtime-http-junit</artifactId>
<version>\${talend-component.version}</version>
<scope>test</scope>
</dependency>``````
 This module uses Apache Johnzon and Netty. If you have any conflict (in particular with Netty), you can add the `shaded` classifier to the dependency. This way, both dependencies are shaded, which avoids conflicts with your component.

It supports both JUnit 4 and JUnit 5. The concept is the exact same one: the extension/rule is able to serve precomputed responses saved in the classpath.

You can plug your own `ResponseLocator` to map a request to a response, but the default implementation - which should be sufficient in most cases - looks in `talend/testing/http/<class name>_<method name>.json`. Note that you can also put it in `talend/testing/http/<request path>.json`.

#### JUnit 4

JUnit 4 setup is done through two rules:

• `JUnit4HttpApi`, which is starts the server.

• `JUnit4HttpApiPerMethodConfigurator`, which configures the server per test and also handles the capture mode.

 If you don’t use the `JUnit4HttpApiPerMethodConfigurator`, the capture feature is disabled and the per test mocking is not available.
Test example
``````public class MyRESTApiTest {
@ClassRule
public static final JUnit4HttpApi API = new JUnit4HttpApi();

@Rule
public final JUnit4HttpApiPerMethodConfigurator configurator = new JUnit4HttpApiPerMethodConfigurator(API);

@Test
public void direct() throws Exception {
}
}``````
##### SSL

For tests using SSL-based services, you need to use `activeSsl()` on the `JUnit4HttpApi` rule.

You can access the client SSL socket factory through the API handler:

``````@ClassRule
public static final JUnit4HttpApi API = new JUnit4HttpApi().activeSsl();

@Test
public void test() throws Exception {
final HttpsURLConnection connection = getHttpsConnection();
connection.setSSLSocketFactory(API.getSslContext().getSocketFactory());
// ....
}``````

#### JUnit 5

JUnit 5 uses a JUnit 5 extension based on the `HttpApi` annotation that you can add to your test class. You can inject the test handler - which has some utilities for advanced cases - through `@HttpApiInject`:

``````@HttpApi
class JUnit5HttpApiTest {
@HttpApiInject
private HttpApiHandler<?> handler;

@Test
void getProxy() throws Exception {
}
}``````
 The injection is optional and the `@HttpApi` annotation allows you to configure several test behaviors.
##### SSL

For tests using SSL-based services, you need to use `@HttpApi(useSsl = true)`.

You can access the client SSL socket factory through the API handler:

``````@HttpApi*(useSsl = true)*
class MyHttpsApiTest {
@HttpApiInject
private HttpApiHandler<?> handler;

@Test
void test() throws Exception {
final HttpsURLConnection connection = getHttpsConnection();
connection.setSSLSocketFactory(handler.getSslContext().getSocketFactory());
// ....
}
}``````

#### Capturing mode

The strength of this implementation is to run a small proxy server and to auto-configure the JVM: `http[s].proxyHost`, `http[s].proxyPort`, `HttpsURLConnection#defaultSSLSocketFactory` and `SSLContext#default` are auto-configured to work out-of-the-box with the proxy.

It allows you to keep the native and real URLs in your tests. For example, the following test is valid:

``````public class GoogleTest {
@ClassRule
public static final JUnit4HttpApi API = new JUnit4HttpApi();

@Rule
public final JUnit4HttpApiPerMethodConfigurator configurator = new JUnit4HttpApiPerMethodConfigurator(API);

@Test
public void google() throws Exception {
}

private int get(final String uri) throws Exception {
// do the GET request, skipped for brievity
}
}``````

If you execute this test, it fails with an HTTP 400 error because the proxy does not find the mocked response.
You can create it manually, as described in component-runtime-http-junit, but you can also set the `talend.junit.http.capture` property to the folder storing the captures. It must be the root folder and not the folder where the JSON files are located (not prefixed by `talend/testing/http` by default).

In most cases, use `src/test/resources`. If `new File("src/test/resources")` resolves the valid folder when executing your test (Maven default), then you can just set the system property to `true`. Otherwise, you need to adjust accordingly the system property value.

When the tests run with this system property, the testing framework creates the correct mock response files. After that, you can remove the system property. The tests will still pass, using `google.com`, even if you disconnect your machine from the Internet.

#### Passthrough mode

If you set the `talend.junit.http.passthrough` system property to `true`, the server acts as a proxy and executes each request to the actual server - similarly to the capturing mode.

## Beam testing

If you want to make sure that your component works in Beam and don’t want to use Spark, you can try with the Direct Runner.

Check beam.apache.org/contribute/testing/ for more details.

## Testing on multiple environments

JUnit (4 or 5) already provides ways to parameterize tests and execute the same "test logic" against several sets of data. However, it is not very convenient for testing multiple environments.

For example, with Beam, you can test your code against multiple runners. But it requires resolving conflicts between runner dependencies, setting the correct classloaders, and so on.

To simplify such cases, the framework provides you a multi-environment support for your tests, through the JUnit module, which works with both JUnit 4 and JUnit 5.

### JUnit 4

``````@RunWith(MultiEnvironmentsRunner.class)
@Environment(Env1.class)
@Environment(Env2.class)
public class TheTest {
@Test
public void test1() {
// ...
}
}``````

The `MultiEnvironmentsRunner` executes the tests for each defined environments. With the example above, it means that it runs `test1` for `Env1` and `Env2`.

By default, the `JUnit4` runner is used to execute the tests in one environment, but you can use `@DelegateRunWith` to use another runner.

### JUnit 5

The multi-environment configuration with JUnit 5 is similar to JUnit 4:

``````@Environment(EnvironmentsExtensionTest.E1.class)
@Environment(EnvironmentsExtensionTest.E2.class)
class TheTest {

@EnvironmentalTest
void test1() {
// ...
}
}``````

The main differences are that no runner is used because they do not exist in JUnit 5, and that you need to replace `@Test` by `@EnvironmentalTest`.

 With JUnit5, tests are executed one after another for all environments, while tests are ran sequentially in each environments with JUnit 4. For example, this means that `@BeforeAll` and `@AfterAll` are executed once for all runners.

### Provided environments

The provided environment sets the contextual classloader in order to load the related runner of Apache Beam.

Package: `org.talend.sdk.component.junit.environment.builtin.beam`

 the configuration is read from system properties, environment variables, …​.
Class Name Description

ContextualEnvironment

Contextual

Contextual runner

DirectRunnerEnvironment

Direct

Direct runner

SparkRunnerEnvironment

Spark

Spark runner

### Configuring environments

If the environment extends `BaseEnvironmentProvider` and therefore defines an environment name - which is the case of the default ones - you can use `EnvironmentConfiguration` to customize the system properties used for that environment:

``````@Environment(DirectRunnerEnvironment.class)
@EnvironmentConfiguration(
environment = "Direct",
systemProperties = @EnvironmentConfiguration.Property(key = "beamTestPipelineOptions", value = "..."))

@Environment(SparkRunnerEnvironment.class)
@EnvironmentConfiguration(
environment = "Spark",
systemProperties = @EnvironmentConfiguration.Property(key = "beamTestPipelineOptions", value = "..."))

@EnvironmentConfiguration(
systemProperties = @EnvironmentConfiguration.Property(key = "beamTestPipelineOptions", value = "..."))
class MyBeamTest {

@EnvironmentalTest
void execute() {
// run some pipeline
}
}``````
 If you set the `.skip` system property to `true`, the environment-related executions are skipped.

This usage assumes that Beam 2.4.0 is used.

The following dependencies bring the JUnit testing toolkit, the Beam integration and the multi-environment testing toolkit for JUnit into the test scope.

Dependencies
``````<dependencies>
<dependency>
<groupId>org.talend.sdk.component</groupId>
<artifactId>component-runtime-junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.jboss.shrinkwrap.resolver</groupId>
<artifactId>shrinkwrap-resolver-impl-maven</artifactId>
<version>3.1.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.talend.sdk.component</groupId>
<artifactId>component-runtime-beam</artifactId>
<scope>test</scope>
</dependency>
</dependencies>``````

Using the fluent DSL to define jobs, you can write a test as follows:

 Your job must be linear and each step must send a single value (no multi-input or multi-output).
``````@Environment(ContextualEnvironment.class)
@Environment(DirectRunnerEnvironment.class)
class TheComponentTest {
@EnvironmentalTest
void testWithStandaloneAndBeamEnvironments() {
from("myfamily://in?config=xxxx")
.to("myfamily://out")
.create()
.execute();
// add asserts on the output if needed
}
}``````

It executes the chain twice:

1. With a standalone environment to simulate the Studio.

2. With a Beam (direct runner) environment to ensure the portability of your job.

You can reuse Maven `settings.xml` server files, including the encrypted ones. `org.talend.sdk.component.maven.MavenDecrypter` allows yo to find a `username`/`password` from a server identifier:

``````final MavenDecrypter decrypter = new MavenDecrypter();
final Server decrypted = decrypter.find("my-test-server");

It is very useful to avoid storing secrets and to perform tests on real systems on a continuous integration platform.

 Even if you don’t use Maven on the platform, you can generate the `settings.xml` and`settings-security.xml` files to use that feature. See maven.apache.org/guides/mini/guide-encryption.html for more details.

## Generating data

Several data generators exist if you want to populate objects with a semantic that is more evolved than a plain random string like `commons-lang3`:

Even more advanced, the following generators allow to directly bind generic data on a model. However, data quality is not always optimal:

There are two main kinds of implementation:

• Implementations using a pattern and random generated data.

• Implementations using a set of precomputed data extrapolated to create new values.

Check your use case to know which one fits best.

 An alternative to data generation can be to import real data and use Talend Studio to sanitize the data, by removing sensitive information and replacing it with generated or anonymized data. Then you just need to inject that file into the system.

If you are using JUnit 5, you can have a look at glytching.github.io/junit-extensions/randomBeans.

Scroll to top