PydicomUserGuide
pydicom guide -- object model, description of classes, examples
Dataset

Dataset is the base object in pydicom's object model. The relationship between Dataset and other objects is:

Dataset (derived from python's dict)
  ---> contains DataElement instances
    ---> the value of a data element can be a regular value (such as a string or a number) or a Sequence, which is a list of Datasets

Dataset is the main object you will work with directly. Because it is derived from python's dict, it inherits (and overrides some of) the methods of dict. In other words, it is a collection of key:value pairs, where the key is the DICOM (group,element) tag (as a Tag object, described below) and the value is a DataElement instance (also described below).

A dataset could be created directly, but you will usually get one by reading an existing DICOM file:

>>> import dicom
>>> ds = dicom.read_file("rtplan.dcm") # (rtplan.dcm is in the testfiles directory)

You can display the entire dataset by simply printing its string (str or repr) value:

>>> ds
(0008, 0012) Instance Creation Date DA: '20030903'
(0008, 0013) Instance Creation Time TM: '150031'
(0008, 0016) SOP Class UID UI: RT Plan Storage
(0008, 0018) SOP Instance UID UI: 1.2.777.777.77.7.7777.7777.20030903150023
(0008, 0020) Study Date DA: '20030716'
(0008, 0030) Study Time TM: '153557'
(0008, 0050) Accession Number SH: ''
(0008, 0060) Modality CS: 'RTPLAN'
...

You can also view DICOM files in a collapsible tree using the example program dicomtree.py.

A brief aside: pydicom no longer includes the file meta information (group 2) in the main dataset (they are in fact two separate datasets in the DICOM standard). The file meta information is now stored in the file_meta attribute of the dataset:

>>> ds.file_meta
(0002, 0001) File Meta Information Version OB: '\x00\x01'
(0002, 0002) Media Storage SOP Class UID UI: RT Plan Storage
(0002, 0003) Media Storage SOP Instance UID UI: 1.2.999.999.99.9.9999.9999.20030903150023
(0002, 0010) Transfer Syntax UID UI: Implicit VR Little Endian
(0002, 0012) Implementation Class UID UI: 1.2.888.888.88.8.8.8

You can access specific data elements in a dataset by name or by DICOM tag number:

>>> ds.PatientsName
'Last^First^mid^pre'
>>> ds[0x10,0x10].value
'Last^First^mid^pre'

In the latter case (using the tag number directly) a DataElement instance is returned, so .value must be used to get the value.

You can also set values by name or tag number:

>>> ds.PatientID = "12345"
>>> ds.SeriesNumber = 5
>>> ds[0x10,0x10].value = 'Test'

The use of names is possible because pydicom intercepts requests for member variables and checks whether they are in the DICOM dictionary. It translates the name to a (group,element) number and returns the corresponding value for that key if it exists. The names are the descriptive text from the dictionary with spaces, apostrophes, etc. removed.

DICOM Sequences are turned into python lists. For these, the name is the dictionary name with "sequence" removed and the normal English plural added. So "Beam Sequence" becomes "Beams", and "Referenced Film Box Sequence" becomes "ReferencedFilmBoxes". Items in the sequence are referenced by number, beginning at index 0 as per python convention.

>>> ds.Beams[0].BeamName
'Field 1'
>>> # Same thing with tag numbers (not as pretty!):
>>> ds[0x300a,0xb0][0][0x300a,0xc2].value
'Field 1'
>>> # yet another way, using another variable
>>> beam1=ds.Beams[0]
>>> beam1.BeamName, beam1[0x300a,0xc2].value
('Field 1', 'Field 1')

See WorkingWithSequences for more details about pydicom and sequences, including creating and altering them.

Since you may not always remember the exact name of a data element, Dataset provides a handy dir() method, useful during interactive sessions at the python prompt:

>>> ds.dir("pat")
['PatientSetups', 'PatientsBirthDate', 'PatientsID', 'PatientsName', 'PatientsSex']

dir will return any DICOM tag names in the dataset that have the specified string anywhere in the name (case insensitive). Calling dir with no string will list all tag names available in the dataset. You can also see all the names that pydicom knows about by viewing the _dicom_dict.py file. You could modify that file to add tags that pydicom doesn't already know about.

Under the hood, Dataset stores a DataElement object for each item, but when accessed by name (e.g. ds.PatientsName) only the value of that DataElement is returned. If you need the whole DataElement (see the DataElement class discussion), you can use Dataset's data_element() method or access the item using the tag number:

>>> data_element = ds.data_element("PatientsName") # or data_element = ds[0x10,0x10]
>>> data_element.VR, data_element.value
('PN', 'Last^First^mid^pre')

To check for the existence of a particular tag before using it, use the in keyword:

>>> "PatientsName" in ds
True

To remove a data element from the dataset, use del:

>>> del ds[0x10,0x1000]
>>> # OR
>>> tag = ds.data_element("OtherPatientIDs").tag
>>> del ds[tag]

To work with pixel data, the raw bytes are available through the usual tag:

>>> pixel_bytes = ds.PixelData

but to work with them as a NumPy array, use pixel_array (requires the NumPy library):

>>> pix = ds.pixel_array

For more details, see WorkingWithPixelData.

DataElement

The DataElement class is not usually used directly in user code, but is used extensively by Dataset. DataElement is a simple object that stores, among other things, the element's tag (a Tag object, described below), its VR (the DICOM value representation, such as 'PN') and its value.
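
The arrangement described above (a dict keyed by tag that holds element objects, where a lookup by name returns only the value) can be sketched in plain python. This is only an illustration of the idea, not pydicom's actual code: SketchElement, SketchDataset and the two-entry NAME_TO_TAG table are hypothetical names invented for this example.

```python
class SketchElement:
    """Hypothetical stand-in for DataElement: a tag, a VR and a value."""

    def __init__(self, tag, VR, value):
        self.tag = tag
        self.VR = VR
        self.value = value


# Tiny illustrative name -> (group, element) table (an assumption for this
# sketch; a real DICOM dictionary is far larger).
NAME_TO_TAG = {
    "PatientsName": (0x0010, 0x0010),
    "PatientID": (0x0010, 0x0020),
}


class SketchDataset(dict):
    """Hypothetical dict subclass keyed by (group, element) tuples."""

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        tag = NAME_TO_TAG.get(name)
        if tag is None or tag not in self:
            raise AttributeError(name)
        return self[tag].value  # access by name yields the bare value

    def data_element(self, name):
        # Return the whole element rather than just its value.
        return self[NAME_TO_TAG[name]]


ds = SketchDataset()
ds[0x10, 0x10] = SketchElement((0x10, 0x10), "PN", "Last^First^mid^pre")
print(ds.PatientsName)                    # prints Last^First^mid^pre
print(ds[0x10, 0x10].VR)                  # prints PN
print(ds.data_element("PatientsName").VR) # prints PN
```

Raising AttributeError for unknown or absent names keeps hasattr() and similar checks behaving normally on the sketch class.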
Tag

The Tag class is derived from python's long, so in effect it is just a number with some extra behaviour:
>>> from dicom.tag import Tag
>>> t1=Tag(0x00100010) # all of these are equivalent
>>> t2=Tag(0x10,0x10)
>>> t3=Tag((0x10, 0x10))
>>> t1
(0010, 0010)
>>> t1==t2, t1==t3
(True, True)
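
To make "just a number with some extra behaviour" concrete, here is a minimal sketch of how a (group, element) pair can be packed into a single integer, with the group in the upper 16 bits. It is not pydicom's Tag class: SketchTag is a hypothetical name, and because python 3 has a single int type, the sketch subclasses int rather than long.

```python
class SketchTag(int):
    """Hypothetical int subclass packing (group, element) into one number."""

    def __new__(cls, arg, arg2=None):
        if arg2 is not None:            # SketchTag(0x10, 0x10)
            arg = (arg, arg2)
        if isinstance(arg, tuple):      # SketchTag((0x10, 0x10))
            group, element = arg
            arg = (group << 16) | element
        return super().__new__(cls, arg)

    @property
    def group(self):
        return self >> 16               # upper 16 bits

    @property
    def element(self):
        return self & 0xFFFF            # lower 16 bits

    def __repr__(self):
        return "(%04x, %04x)" % (self.group, self.element)


t1 = SketchTag(0x00100010)
t2 = SketchTag(0x10, 0x10)
t3 = SketchTag((0x10, 0x10))
print(repr(t1))             # prints (0010, 0010)
print(t1 == t2, t1 == t3)   # prints True True
```

Because the sketch is an int, equality, hashing and ordering come for free, which is what makes such an object usable as a dict key.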
Sequence

Sequence is derived from python's list. The only added functionality is to make string representations prettier. Otherwise all the usual methods of list, like item selection, append, etc., are available.

For the most part, sequences in pydicom work transparently. But if you need more details, especially if creating your own sequence, see WorkingWithSequences.
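
The naming rule given in the Dataset section (remove spaces and apostrophes; for sequences, drop "Sequence" and add an English plural) can be approximated as follows. This is a rough sketch under simplifying assumptions: sketch_name and pluralize are hypothetical helpers, and the plural rule is a basic one that may not match pydicom's behaviour for every dictionary entry.

```python
import re


def pluralize(word):
    """Very simple English plural (an assumption, not pydicom's exact rule)."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if word.endswith("y") and word[-2:-1].lower() not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"


def sketch_name(dictionary_text):
    """Turn descriptive dictionary text into an attribute-style name."""
    # Drop apostrophes and other punctuation, then split on whitespace.
    words = re.sub(r"[^A-Za-z0-9 ]", "", dictionary_text).split()
    is_sequence = bool(words) and words[-1].lower() == "sequence"
    if is_sequence:
        words = words[:-1]
    name = "".join(words)
    if is_sequence and name:
        name = pluralize(name)
    return name


print(sketch_name("Patient's Name"))                # prints PatientsName
print(sketch_name("Beam Sequence"))                 # prints Beams
print(sketch_name("Referenced Film Box Sequence"))  # prints ReferencedFilmBoxes
```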
Really great, that's what I have been looking for all this time - pure python DICOM reading
thanks!
Thank you very much!
Excellent class. Great job guys. A question: what is the most efficient way to use dicom.ReadFile if I'm not going to touch the actual pixel data? I noticed that when I used a small "defer_size" (512 KB), some of the DICOM tags were not available for reading. I have a script that just reads header information and sorts the DICOM files accordingly. Thanks!
@skanterakis: If you don't want to read pixel data, the simplest way in pydicom 0.9.4 is to use read_file(..., stop_before_pixels=True), which saves time and memory. The defer_size argument saves only memory -- it still parses the entire file, it just doesn't load the big items into memory until they are accessed.
If speed is critical, you might also want to look at the time_test.py script in the test/performance subdirectory. You could edit that to run timing tests with your own DICOM files to assess what works best.
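
For a quick comparison along these lines without editing time_test.py, a small timing helper might look like the sketch below. It is not the time_test.py script: best_time is a hypothetical helper that accepts any reading function, so the pydicom calls appear only in the commented usage lines, and the file names there are placeholders.

```python
import time


def best_time(read_func, paths, repeats=3):
    """Return the fastest total time in seconds over `repeats` passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            read_func(path)
        best = min(best, time.perf_counter() - start)
    return best


# Possible usage with the calls discussed above (placeholder file names):
# import dicom
# files = ["a.dcm", "b.dcm"]
# full = best_time(dicom.read_file, files)
# header_only = best_time(
#     lambda p: dicom.read_file(p, stop_before_pixels=True), files)
```

Taking the best of several passes reduces the influence of disk caching and other one-off delays on the comparison.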
Can this package generate a valid SOP Instance UID when outputting the dataset to a dicom file? If not, is there other way to do it? Thanks!
@zqian11: pydicom doesn't handle the details of UIDs. You can of course set any value in a dataset, including UIDs, but the code using pydicom has to decide the value. If you need a UID root, see the comment in UID.py, which has a link to Medical Connections, which kindly offers them.
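
If you do not have a registered UID root, one commonly used option is a UUID-derived UID under the 2.25 root, which is reserved for that purpose. The sketch below uses only python's standard library; sketch_uid is a hypothetical helper, not a pydicom function, and whether this scheme suits your site is for you to decide.

```python
import uuid


def sketch_uid():
    """Build a UID of the form 2.25.<UUID as a decimal integer>."""
    # A 128-bit UUID has at most 39 decimal digits, so the result stays
    # within the 64-character limit for DICOM UIDs.
    return "2.25." + str(uuid.uuid4().int)


uid = sketch_uid()
print(uid)
# The value could then be assigned like any other element, for example:
# ds.SOPInstanceUID = uid
```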
Thank you for this marvellous software. It works very smoothly.
I need to access a manufacturer specific nested sequence, therefore there is no dictionary name. The sequence is not further un-nested and I can see something like this:
(2005, 140f) Private tag data OB: Array of 1104 bytes
Question is: how do I access the information that is in there?
Thanks
I have a DICOM template file, but it lacks certain datasets. How do you add a new dataset?
@markjanbangoy: Do you mean adding new data elements to a dataset? If so, then you can add new elements to a dataset ds like this:
ds.PatientsName = "Last^First" # just use the dictionary name if the data element is in the DICOM standard
or
ds.AddNew(tag, VR, value) # if the data element does not exist in the DICOM dictionary
(This method will soon be renamed to add_new() for PEP-8 compliance.)
I am trying to add a new tag to an existing DICOM file. The information for the tag ( [0x0010,0x1000] "Other Patient IDs") exists in the dictionary. Based upon the suggestion above, I've tried the line "ds.OtherPatientIDs.value = new_value" but it fails with an error that the dataset does not have this attribute. If I try to access it directly "ds[0x0010,0x1000].value = new_value", the function displays the hex tag and exits. Any suggestions?
Darcy, you rock my world! Thanks a lot for this lib!
Guys, is it possible to convert a DICOM file to jpg/gif/png format with pydicom? And what is the command to view DICOM images?
It should be possible to convert to png etc from PIL using im.save(). I don't know the details though, you may have to set some other parameters.
For viewing, see the ViewingImages wiki page, and the contrib folder in the source; there are example modules for using PIL and Tkinter there.
This is amazing! Do you guys have a good sorting class yet, for sorting a stack of dicoms in a directory??
@tylercloke, the contributed file pydicom_series.py collects images in a series. May be close to what you are looking for.
Is there a convenient way to get bvecs/bvals from DTI DICOM data using pydicom?
>> Is there a convenient way to get bvecs/bvals from DTI DICOM data using pydicom?
For others coming across this: there are answers to this on the NITRC site, e.g. http://www.nitrc.org/forum/message.php?msg_id=5538 gives an answer and points to code using pydicom.