This anonymized data set encompasses 9 continuous months and represents 708,304,516 successful authentication events from users to computers collected from the Los Alamos National Laboratory (LANL) enterprise network.
Description
Each authentication event is on a separate line in the form “time,user,computer” and represents a successful authentication by a user to a computer at the given time. The values are comma-delimited.
As an example, here are the first 10 lines of the data set:
1,U1,C1
1,U1,C2
2,U2,C3
3,U3,C4
6,U4,C5
7,U4,C5
7,U5,C6
8,U6,C7
11,U7,C8
12,U8,C9
The data set contains 11,362 users, each represented as U followed by an anonymized, unique number, and 22,284 computers, each represented as C followed by an anonymized, unique number. Timestamps have a resolution of 1 second and start at an epoch of 1; all subsequent times are offsets from this epoch. The actual time frame of the data collection is withheld to strengthen the anonymization of the data.
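The line format described above can be parsed with a few lines of Python. The following is a minimal sketch, not part of the original release: the names AuthEvent and parse_events are hypothetical, and the sample input is simply the first five lines shown above.

```python
from typing import Iterable, Iterator, NamedTuple


class AuthEvent(NamedTuple):
    """One successful authentication (hypothetical helper type)."""
    time: int       # offset in seconds from the epoch; the first event is at 1
    user: str       # anonymized user ID, e.g. "U1"
    computer: str   # anonymized computer ID, e.g. "C1"


def parse_events(lines: Iterable[str]) -> Iterator[AuthEvent]:
    """Yield one AuthEvent per non-empty "time,user,computer" line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        time_str, user, computer = line.split(",")
        yield AuthEvent(int(time_str), user, computer)


# The first five lines of the data set, as listed above.
sample = ["1,U1,C1", "1,U1,C2", "2,U2,C3", "3,U3,C4", "6,U4,C5"]
events = list(parse_events(sample))
users = {event.user for event in events}
computers = {event.computer for event in events}
```

Because parse_events is a generator, the same function could be applied to an open file handle without loading the whole file into memory.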
Some centralized computers (the Active Directory Servers) and the associated authentication events have been removed.
Data
The data is available both as a single file with 708,304,516 text lines and as 9 files, each with 30 days of events. All of the files are compressed with the bzip2 compression algorithm (http://www.bzip.org/).
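Since the files are bzip2-compressed and large, it is usually more practical to stream them than to decompress them to disk first. The sketch below uses Python's standard bz2 module to count events per day. The function name events_per_day is hypothetical, the file path is a placeholder supplied by the caller, and the day index assumes that timestamp 1 is the first second of day 0.

```python
import bz2
from collections import Counter

SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(path: str) -> Counter:
    """Stream a bzip2-compressed event file and count events per day.

    Assumption: timestamp 1 is the first second of day 0, so the
    day index is (time - 1) // 86400.
    """
    counts: Counter = Counter()
    # Mode "rt" decompresses on the fly and yields text lines.
    with bz2.open(path, mode="rt", encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            time_str, _user, _computer = line.split(",")
            counts[(int(time_str) - 1) // SECONDS_PER_DAY] += 1
    return counts
```

Calling events_per_day with the path to one of the downloaded files returns a Counter keyed by day index. Only one line is held in memory at a time, so the approach should scale to the single-file version of the data.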
Citing
If you use this data in a publication please cite the following paper:
A. Hagberg, A. Kent, N. Lemons, and J. Neil, “Credential hopping in
authentication graphs,” in 2014 International Conference on Signal-Image
Technology Internet-Based Systems (SITIS). IEEE Computer Society, Nov.
2014.
@InProceedings{hagberg-2014-credential,
author = {Aric Hagberg and Alex Kent and Nathan Lemons and Joshua Neil},
title = {Credential hopping in authentication graphs},
year = 2014,
booktitle = {2014 International Conference on Signal-Image Technology Internet-Based Systems ({SITIS})},
month = {Nov.},
publisher = {IEEE Computer Society}
}
The data can be cited with the following:
A. D. Kent, “User-computer authentication associations in time,”
Los Alamos National Laboratory, http://dx.doi.org/10.11578/1160076, 2014.
@Misc{kent-2014-authdata,
author = {Alexander D. Kent},
title = {User-Computer Authentication Associations in Time},
year = {2014},
howpublished = {Los Alamos National Laboratory},
doi = {10.11578/1160076},
}
License
To the extent possible under law, Los Alamos National Laboratory has waived all copyright and related or neighboring rights to User-Computer Authentication Associations in Time. This work is published from: United States.
Notes
This data set and associated research have been approved by the LANL Human Subject Research Review Board under approval LANL 14-07 X and have been approved for public release under approval LA-UR-14-28318.
Contact
For questions or feedback, or for updates and news about events related to this data set, please send an e-mail to cyberdata@lanl.gov.