Good Test Data

Data Formats

What formats do we offer, and why does that matter to you?
Check out each extension type for more detailed information on how Good Test Data uses the respective format.

TSV JSON SQL PDF JPG

CSV XML XLS (Excel) TXT MP3 (Audio) MP4 (Video)

ZIP TGZ Other


TSV

Tab-separated values
Wikipedia Definition

Tab Separated Values (TSV) is our preferred format for simple data, for a number of reasons.

  • It's a bit easier to read each column as it's clearly separated.
  • A tab character rarely appears in normal data, so the format requires less special handling for other characters such as commas (,) and quotes (").
  • It is not as verbose as JSON and XML formats.
  • You can easily import a TSV format file into Excel or other spreadsheets just as you could import a CSV file.

Here is an example from our FREE list of 249 countries with Top Level Domains (TLD)

AW	Aruba	ABW	533
AU	Australia	AUS	036
AT	Austria	AUT	040
AZ	Azerbaijan	AZE	031
BS	Bahamas (the)	BHS	044
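
As a minimal sketch of how rows like these could be read, the example below uses Python's standard csv module with a tab delimiter. The column names (alpha2, name, alpha3, numeric) are assumptions for illustration only, since the sample has no header row, and the rows are inlined with a single tab between columns.

```python
import csv
import io

# Rows copied from the sample above, with exactly one tab between columns.
SAMPLE_TSV = (
    "AW\tAruba\tABW\t533\n"
    "AU\tAustralia\tAUS\t036\n"
    "AT\tAustria\tAUT\t040\n"
    "AZ\tAzerbaijan\tAZE\t031\n"
    "BS\tBahamas (the)\tBHS\t044\n"
)

# Hypothetical column names; the sample itself carries no header row.
FIELDS = ["alpha2", "name", "alpha3", "numeric"]


def read_countries(text):
    """Parse tab-separated text into a list of dicts keyed by FIELDS."""
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=FIELDS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary data
    )
    return list(reader)


countries = read_countries(SAMPLE_TSV)
for country in countries:
    print(country["alpha2"], country["name"], country["numeric"])
```

Every value is kept as a string, so numeric codes such as 036 keep their leading zero.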

JSON

JavaScript Object Notation
Wikipedia Definition

JSON is the preferred format for structured data.

Here is an example from our FREE Blog Posts

{
    "article": {
        "id": "a000501",
        "title": "Ten Years Inside the Horse",
        "age": 1,
        "date": "12/30/2013",
        "author": "Anthony Berry",
        "thumb": "http://i.goodtestdata.com/i01/small/i00001.jpg",
        "image": "http://i.goodtestdata.com/i01/medium/i00001.jpg",
        "tags": [ "numbermr", "animal", "same", "eventually", "grew", "rocks", "base", "auto" ],
        "body": [
            "Of the common goat is of some importance. A brown or reddish brown goat retains the reddish cast at the base of the mohair much longer than one of a bluish or bluish black color. It is equally true that a pure white mother may drop a colored kid occasionally. In Constantinople the mohair is graded into parcels containing red kemp, black kemp, etc. There it is the kemp which retains the color. As has been stated, there is also a."
        ]
    }
}
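
A document shaped like this one could be loaded with Python's standard json module, roughly as sketched below. The inlined document is shortened (image URLs dropped, tags and body trimmed), and the month/day/year reading of the date field is an assumption inferred from the single sample value, not a documented format.

```python
import json
from datetime import datetime

# Shortened copy of the sample article above.
ARTICLE_JSON = """
{
    "article": {
        "id": "a000501",
        "title": "Ten Years Inside the Horse",
        "age": 1,
        "date": "12/30/2013",
        "author": "Anthony Berry",
        "tags": ["numbermr", "animal", "same"],
        "body": ["Of the common goat is of some importance."]
    }
}
"""

# The top-level object wraps the post in a single "article" key.
article = json.loads(ARTICLE_JSON)["article"]

# Assumption: dates are written as month/day/year.
published = datetime.strptime(article["date"], "%m/%d/%Y").date()

print(article["title"], "by", article["author"])
print("published:", published.isoformat())
print("tags:", ", ".join(article["tags"]))
```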
      

SQL

Structured Query Language
Wikipedia Definition

SQL is a language for working with relational databases (RDBMS). For each dataset we offer a table structure for loading the data into a MySQL database. Data can generally be loaded directly from the TSV format. For more complex datasets, such as blog posts, we can also provide INSERT statements to populate a normalized data structure.
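
The idea of loading TSV rows straight into a table can be sketched as follows. To keep the example self-contained it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, so this is not the actual Good Test Data table structure. The table name and column names are hypothetical.

```python
import csv
import io
import sqlite3

# Two rows in the same four-column layout as the country list sample.
SAMPLE_TSV = "AW\tAruba\tABW\t533\nAU\tAustralia\tAUS\t036\n"

# Stand-in database: SQLite in memory rather than MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical table structure; codes are TEXT so leading zeros survive.
conn.execute(
    """
    CREATE TABLE country (
        alpha2       TEXT PRIMARY KEY,
        name         TEXT NOT NULL,
        alpha3       TEXT NOT NULL,
        numeric_code TEXT NOT NULL
    )
    """
)

# Each TSV row maps directly onto one INSERT with four parameters.
rows = csv.reader(io.StringIO(SAMPLE_TSV), delimiter="\t", quoting=csv.QUOTE_NONE)
conn.executemany("INSERT INTO country VALUES (?, ?, ?, ?)", rows)
conn.commit()

name, numeric_code = conn.execute(
    "SELECT name, numeric_code FROM country WHERE alpha3 = ?", ("AUS",)
).fetchone()
print(name, numeric_code)
```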

PDF

Portable Document Format
Wikipedia Definition

For our example attachments we use the standard PDF format for documents.

Here is an example from our FREE datasets.

JPG

Joint Photographic Experts Group (JPEG)
Wikipedia Definition

We use the JPG format when providing images. Here is an example from our FREE images.

CSV

Comma-separated values

Historically, Comma Separated Values (CSV) has been a common format for providing column data in a text file. At Good Test Data we prefer TSV, as it offers several advantages in processing.
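
One such processing advantage can be illustrated with a value that contains a comma. In the sketch below, Python's standard csv module writes the same row with each delimiter. The row is illustrative only and is not taken from a Good Test Data dataset.

```python
import csv
import io

# Illustrative row whose second field contains a comma.
row = ["KR", "Korea, Republic of", "KOR"]


def to_line(fields, delimiter):
    """Serialize one row with the given delimiter, quoting only when required."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerow(fields)
    return buf.getvalue()


csv_line = to_line(row, ",")   # the comma inside the field forces quoting
tsv_line = to_line(row, "\t")  # no tab inside the data, so no quoting is needed

print(repr(csv_line))  # 'KR,"Korea, Republic of",KOR\n'
print(repr(tsv_line))  # 'KR\tKorea, Republic of\tKOR\n'
```

With a comma delimiter, a reader has to understand the quoting rules to recover the field. With a tab delimiter, splitting on the tab character is enough for this row.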

XML

Extensible Markup Language

Wikipedia Definition

Good Test Data does not provide any data in XML format. We currently recommend and use JSON as a less verbose format.

Here is an example of what XML would look like with our FREE names data source.

<?xml version="1.0"?>
<people>
  <person>
    <name>John Smith</name>
    <email>john.smith@g42.com</email>
  </person>
</people>
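
For comparison, a document like this could be read with Python's standard xml.etree.ElementTree module and re-expressed as JSON, as in the rough sketch below. The XML is inlined with a well-formed declaration, and the shape of the JSON output is an assumption for illustration, not a format Good Test Data publishes.

```python
import json
import xml.etree.ElementTree as ET

# Inlined copy of the sample; the declaration must be the first thing in the text.
PEOPLE_XML = (
    '<?xml version="1.0"?>\n'
    "<people>\n"
    "  <person>\n"
    "    <name>John Smith</name>\n"
    "    <email>john.smith@g42.com</email>\n"
    "  </person>\n"
    "</people>\n"
)

root = ET.fromstring(PEOPLE_XML)

# Collect each <person> element as a plain dict.
people = [
    {"name": person.findtext("name"), "email": person.findtext("email")}
    for person in root.findall("person")
]

# Hypothetical JSON shape carrying the same information.
people_json = json.dumps({"people": people})
print(people_json)
```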
      

XLS

The XLS format is a proprietary format used by the Microsoft Excel spreadsheet program. Excel can easily import TSV files, the format we use for our common datasets.

TXT

When we provide a single column of information, we will use a simple TXT format rather than a TSV format.

MP3

The MP3 format is used for audio files. In the future we plan to offer audio content in MP3 format.

MP4

The MP4 format is a popular video format. In the future we plan to offer video content in MP4 format.

ZIP

All datasets will be made available in a ZIP archive. This is a popular format for combining multiple files and compressing them to save space.

TGZ

Our datasets are also available in TGZ or tar/gzip format. This is a popular Linux format for combining and compressing multiple files.
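
Reading a file out of either archive type might look like the sketch below, which uses Python's standard zipfile and tarfile modules. To stay self-contained it builds small archives in memory instead of downloading one. The member name countries.tsv is hypothetical and not an actual Good Test Data file name.

```python
import io
import tarfile
import zipfile

MEMBER = "countries.tsv"        # hypothetical file name inside the archive
DATA = b"AW\tAruba\tABW\t533\n"  # one row from the country list sample


def make_zip():
    """Build a small ZIP archive in memory, standing in for a downloaded file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(MEMBER, DATA)
    return buf.getvalue()


def make_tgz():
    """Build a small gzip-compressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        info = tarfile.TarInfo(name=MEMBER)
        info.size = len(DATA)
        tf.addfile(info, io.BytesIO(DATA))
    return buf.getvalue()


def read_from_zip(blob, name):
    """Return the bytes of one member of a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)


def read_from_tgz(blob, name):
    """Return the bytes of one regular file in a tar/gzip archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tf:
        return tf.extractfile(name).read()


zip_bytes = read_from_zip(make_zip(), MEMBER)
tgz_bytes = read_from_tgz(make_tgz(), MEMBER)
print(zip_bytes == tgz_bytes)
```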

Other

Want us to provide our data in a different format? Would you like us to load it into a WordPress blog or another open source package? Contact Us for more information.