02/04/16 Firefly dataset discussion

On Thursday afternoon, 02/04/16, Andrew, Logan, Boha and I (Saisi) met in the conference room and discussed our firefly dataset (https://figshare.com/articles/LTER_Lampyrid_data/2068098) for the first time. The discussion was productive, and we came away with: 1. the metadata and DRP (Data Reuse Plan) information for the dataset; 2. hypotheses for the data analysis and paper draft, plus questions for Christie on how we can dig deeper into the data.

1. Data Reuse Plan Worksheet and Metadata

Variable | Description | Units
Sample Date | sample date | -
Treatment | treatment ID | -
Replicate | replicate within the treatment | -
Station | sampling station within each plot | -
Species | scientific name of the species in the trap | -
Family | family that the species belongs to | -
Order | order that the species belongs to | -
Adults | number of adults of that species in the trap at sampling time | number
location | UTM location of the trap in UTM zone 16N | meter
Year | sample year | -

PROJECT LEVEL:

What
Project description (abstract): Firefly counts at the Kellogg Biological Station (KBS) from 2004 to 2015
 
Data set title (e.g. “Data from: ”, “Soil moisture data in Columbia Delta 1982”): LTER_lampyrid_data_20042015
 
Permanent ID (PID types include: DOI, PURL, ARK, Handle, etc.): Unknown
 
Sources of data (if someone else’s data is included in your data set; preferably use a permanent identifier if available): KBS Long-Term Ecological Research (LTER) site
 
Subject area (e.g. Neurological biochemistry, applied ecology, etc.): Entomology
 
Related research publication (include full citation and permanent identifier, if available): Christie’s ladybug publication: http://link.springer.com/article/10.1007%2Fs10530-014-0772-4

Who

Person/organization responsible for collecting data: Christie and her colleagues
Sponsoring or funding agency, grant number, and PI name/s & affiliations: GLBRC?
Collaborators (if applicable): ?
 

Contact person, their affiliation and contact info for questions about the data: Christie

Where

Location where data was collected (use geographic coordinates if appropriate): KBS LTER Main Site http://lter.kbs.msu.edu/maps/images/current-lter-plot-map.pdf
Place of publication (e.g. institution or repository where data is made available): PeerJ

When

Dates of collection (specific date, date range): 2004-2015
 
Date of publication (when data was made publicly available): By the end of 2016 spring

How

Data collection process (what instruments were used to collect the data? how frequently were the data collected? how were data collection sites selected? if there was a sample population, how was it selected?): Sticky tags in KBS main station
 

Data processing description (how did you clean the data? how are null values handled? did you write code for processing the data and where can it be found?)

FILE LEVEL:

File format (are there multiple formats? what software is needed to use the file/s?) NB: Avoid proprietary formats if possible! CSV file
File structure (if more than one file in dataset; include folder and file index, naming conventions, README files; provides context): 
DATE TREAT_DESC HABITAT REPLICATE STATION ADULTS
Survey instruments (if any, include permanent identifier that points to the instrument if not included in files): Sticky tags
Field names and definitions (include units of measurement, formulas used for calculation, explain abbreviations): KBS LTER
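
As a rough illustration of how a file with the columns listed above might be read, here is a minimal Python sketch. It assumes the file is comma-separated with a header row and that DATE is written as YYYY-MM-DD; neither has been checked against the real file. The sample rows, the treatment and habitat labels, and the helper name load_rows are made up for illustration.

```python
import csv
import io
from datetime import date

# Hypothetical rows: the values are illustrative only, not taken from the dataset.
SAMPLE_CSV = """DATE,TREAT_DESC,HABITAT,REPLICATE,STATION,ADULTS
2004-06-15,T1,habitat_A,R1,1,3
2004-06-15,T2,habitat_A,R1,2,0
2005-07-01,T3,habitat_B,R2,1,5
"""

def load_rows(text):
    """Parse CSV text into dicts with a typed DATE, an integer ADULTS and a derived YEAR."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        row["DATE"] = date.fromisoformat(raw["DATE"])  # assumes ISO-formatted dates
        row["ADULTS"] = int(raw["ADULTS"])             # count of adults in the trap
        row["YEAR"] = row["DATE"].year                 # derived sample year
        rows.append(row)
    return rows

rows = load_rows(SAMPLE_CSV)
print(len(rows), rows[0]["YEAR"], rows[0]["ADULTS"])
```

With the real file, the same function could be fed the file's contents instead of SAMPLE_CSV, once the delimiter and date format have been confirmed.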
 
 

2. Hypothesis and questions:

Preliminary test: By sorting the data and plotting the firefly population against different variables (Logan is a quick plotter!), we found that population correlates with year, tilling and landscape.
Therefore, we hypothesized that the firefly population depends on habitat type + year + organic/not + (weather + temperature). To decouple the effects of the different factors, we may take a snapshot of each year and study habitat type and treatment (organic, tilling, etc.) within it.
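
The yearly snapshot idea could be sketched roughly as follows: group the counts by year first, then compare habitats within each year so that the year effect is held fixed. This is only one possible reading of "snapshot"; the records, the habitat labels and the function name yearly_snapshots are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, habitat, adults) records, for illustration only.
records = [
    (2004, "habitat_A", 3),
    (2004, "habitat_A", 1),
    (2004, "habitat_B", 6),
    (2005, "habitat_A", 0),
    (2005, "habitat_B", 2),
    (2005, "habitat_B", 4),
]

def yearly_snapshots(records):
    """Return {year: {habitat: mean adults per sample}}."""
    counts = defaultdict(lambda: defaultdict(list))
    for year, habitat, adults in records:
        counts[year][habitat].append(adults)
    # Average within each (year, habitat) cell so habitats are compared inside one year.
    return {
        year: {habitat: mean(values) for habitat, values in by_habitat.items()}
        for year, by_habitat in counts.items()
    }

snapshots = yearly_snapshots(records)
for year in sorted(snapshots):
    print(year, snapshots[year])
```

The same grouping would work with treatment in place of habitat, which would let us look at organic versus conventional plots one year at a time.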
Remaining questions: Do we need a model in which population (the dependent variable) depends on year and habitat type (the two most significant independent variables)? If yes, is there an empirical model we can take from previous papers? What parameters should we estimate? How many variables should we consider? If we are not going to develop a mathematical model, what statistical analysis should we run?